This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] misc-assertion-count: A New Check
AbandonedPublic

Authored by lebedev.ri on May 19 2017, 10:17 AM.

Details

Summary

Allows to impose limits on the lines-per-assertions ratio for functions.

As referenced in:

Finds functions that have more than LinesThreshold lines, counts assertions
in them, and if the ratio of lines-per-assert is higher than configured,
emits a warning.

The lines-per-assert ratio can be controlled via four options, which form
a stair-step curve, with first two options controlling the static part, and
the second two controlling the dynamic part - i.e. how the expected minimal
assertion count changes with the line count.

I think, this is pretty much ready for review.
One big missing thing that i know of is an ability to also count C++ lambdas,
but i have tried and failed to do that.

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.May 19 2017, 10:17 AM

I will obviously try to add lambda handling, but until then, the general idea and design should be pretty much done, so i'd love to hear any feedback about the current diff :)

alexfh requested changes to this revision.Jun 8 2017, 1:23 AM

I guess, this check should go to readability or elsewhere, but definitely not to misc.

Another big question is whether it's reasonable to set up specific ratio limits on the density of asserts. I think, density of asserts strongly depends on the nature of the code, and there is no single answer to how much asserts should be used. IIUC, neither of the recommendations you mentioned contain any quantitative measures, they just delegate the decision to the developer. I'm not saying it's impossible to find good formalization of these rules, but I'd expect some sort of analysis of existing codebases with regard to how asserts are used (not just the density of asserts, but also positioning of asserts and what is being checked by the asserts) in different types of code.

This revision now requires changes to proceed.Jun 8 2017, 1:23 AM

I guess, this check should go to readability or elsewhere, but definitely not to misc.

Hmm, misc may be a bad place for this, but i think readability is even worse fit.
The best guess would be something like hardening / security, but there is no such category.

Another big question is whether it's reasonable to set up specific ratio limits on the density of asserts. I think, density of asserts strongly depends on the nature of the code, and there is no single answer to how much asserts should be used. IIUC, neither of the recommendations you mentioned contain any quantitative measures, they just delegate the decision to the developer.

No, it is not reasonable to set up default ratio limits on the density of asserts.
That is exactly why the default params are NOP, and i even made sure that if the params are NOP, this check will not add any overhead (no PPCallback, no matchers).

Did that answer your question?

I'm not saying it's impossible to find good formalization of these rules, but I'd expect some sort of analysis of existing codebases with regard to how asserts are used (not just the density of asserts, but also positioning of asserts and what is being checked by the asserts) in different types of code.

alexfh added a comment.Jun 8 2017, 3:52 AM

I guess, this check should go to readability or elsewhere, but definitely not to misc.

Hmm, misc may be a bad place for this, but i think readability is even worse fit.
The best guess would be something like hardening / security, but there is no such category.

readability might be reasonable, since one of the most important functions of asserts is documenting invariants. The second function is enforcing invariants, but in correct code asserts are basically no-ops, and they fire only when the code is being changed and the invariants become broken.

Another big question is whether it's reasonable to set up specific ratio limits on the density of asserts. I think, density of asserts strongly depends on the nature of the code, and there is no single answer to how much asserts should be used. IIUC, neither of the recommendations you mentioned contain any quantitative measures, they just delegate the decision to the developer.

No, it is not reasonable to set up default ratio limits on the density of asserts.
That is exactly why the default params are NOP, and i even made sure that if the params are NOP, this check will not add any overhead (no PPCallback, no matchers).

Did that answer your question?

No, I'm not talking about default ratio limits. I wonder whether there's any specific combination of the options that would serve well to the needs of any real project. I suspect that the number of lines in a function alone is insufficient to determine reasonable limits on the number of asserts in it. I guess, if it's even possible to find out the number of invariants that is valuable to enforce in a certain function, a much larger set of inputs should be considered. One little example is that many functions check whether their pointer arguments are not null. Thus, a function that has no pointer arguments will naturally require fewer asserts.

The only relevant formalized rule I found, is the JPL's rule 16 you refer to (functions with more than 10 LOCs should have at least one assertion). But it doesn't go further and say how many assertions should be in a function with, say, 30 lines of code (I think, because it would be rather difficult to formalize all the different situations).

I think, this kind of a check needs some prior research (not necessarily in the sense of a printed paper, but at least a thoughtful analysis of the relevant metrics on real code bases) to back up the specific way the sufficiency of asserts is determined.

I'm not saying it's impossible to find good formalization of these rules, but I'd expect some sort of analysis of existing codebases with regard to how asserts are used (not just the density of asserts, but also positioning of asserts and what is being checked by the asserts) in different types of code.

I guess, this check should go to readability or elsewhere, but definitely not to misc.

Hmm, misc may be a bad place for this, but i think readability is even worse fit.
The best guess would be something like hardening / security, but there is no such category.

readability might be reasonable, since one of the most important functions of asserts is documenting invariants. The second function is enforcing invariants, but in correct code asserts are basically no-ops, and they fire only when the code is being changed and the invariants become broken.

Ok, got it.

Another big question is whether it's reasonable to set up specific ratio limits on the density of asserts. I think, density of asserts strongly depends on the nature of the code, and there is no single answer to how much asserts should be used. IIUC, neither of the recommendations you mentioned contain any quantitative measures, they just delegate the decision to the developer.

No, it is not reasonable to set up default ratio limits on the density of asserts.
That is exactly why the default params are NOP, and i even made sure that if the params are NOP, this check will not add any overhead (no PPCallback, no matchers).

Did that answer your question?

No, I'm not talking about default ratio limits. I wonder whether there's any specific combination of the options that would serve well to the needs of any real project. I suspect that the number of lines in a function alone is insufficient to determine reasonable limits on the number of asserts in it. I guess, if it's even possible to find out the number of invariants that is valuable to enforce in a certain function, a much larger set of inputs should be considered. One little example is that many functions check whether their pointer arguments are not null. Thus, a function that has no pointer arguments will naturally require fewer asserts.

The only relevant formalized rule I found, is the JPL's rule 16 you refer to (functions with more than 10 LOCs should have at least one assertion). But it doesn't go further and say how many assertions should be in a function with, say, 30 lines of code (I think, because it would be rather difficult to formalize all the different situations).

I think, this kind of a check needs some prior research (not necessarily in the sense of a printed paper, but at least a thoughtful analysis of the relevant metrics on real code bases) to back up the specific way the sufficiency of asserts is determined.

While might be slightly unrelated(?), there is obviously a Cyclomatic Complexity, and a Cognitive Complexity, from SonarQube.
The latter one might actually be interesting to have in readability-function-size or a separate check... Not sure if there are restrictions on the algorithm though.

I'm not saying it's impossible to find good formalization of these rules, but I'd expect some sort of analysis of existing codebases with regard to how asserts are used (not just the density of asserts, but also positioning of asserts and what is being checked by the asserts) in different types of code.

I think, this kind of a check needs some prior research (not necessarily in the sense of a printed paper, but at least a thoughtful analysis of the relevant metrics on real code bases) to back up the specific way the sufficiency of asserts is determined.

While might be slightly unrelated(?), there is obviously a Cyclomatic Complexity, and a Cognitive Complexity, from SonarQube.
The latter one might actually be interesting to have in readability-function-size or a separate check... Not sure if there are restrictions on the algorithm though.

I'll work on implementing that CognitiveComplexity as a new check (readability-function-cognitive-complexity, probably), and then we'll see if it will help here or not.

lebedev.ri abandoned this revision.Jun 21 2019, 8:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2019, 8:48 AM