This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] implement new check for const-correctness
AbandonedPublic

Authored by JonasToth on Apr 9 2018, 9:05 AM.

Details

Summary

This check aims to determine values and references that could be declared
as const, but are not.

The first version I am aiming at only analyzes local variables.
No fixits are emitted.

This check aims to implement CPPCG-ES. 25: Declare an object const or constexpr unless you want to modify its value later on
and HICPP-7.1.2 Use const whenever possible.

Because const-correctness includes many issues this check will be implemented
in multiple stages.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
JonasToth edited the summary of this revision. (Show Details)Apr 9 2018, 9:07 AM

@aaron.ballman, @lebedev.ri, @alexfh and everyone else, i would love to hear your opinion on my approach. i never implemented a similar check and i might miss some cases, that need to be considered.

The current state is just a proof-of-concept to see how such a check could be implemented.

It'll be good idea to have option to apply this check for pointer/references only, or include built-in types/enums.

clang-tidy/cppcoreguidelines/ConstCheck.cpp
14

Please remove empty line.

docs/ReleaseNotes.rst
60

Please move into new checks list in alphabetical order.

63

Please add short description here.

docs/clang-tidy/checks/cppcoreguidelines-const.rst
7

Preliminary documentation will be helpful.

Will be good idea to add HICPP alias.

It'll be good idea to have option to apply this check for pointer/references only, or include built-in types/enums.

Agreed. I aim at a mark handles const(default on), mark values const(default on for objects), mark pointer (const int * >> const <<) const (default off)

What do you mean by built-in types/enums? That they are configurable separate to objects?

It'll be good idea to have option to apply this check for pointer/references only, or include built-in types/enums.

Agreed. I aim at a mark handles const(default on), mark values const(default on for objects), mark pointer (const int * >> const <<) const (default off)

What do you mean by built-in types/enums? That they are configurable separate to objects?

I meant integers/floating point numbers/Boolean/enums. It may be overkill to apply this check for legacy code base in single pass.

Thank you for working on this check!

clang-tidy/cppcoreguidelines/ConstCheck.h
61

I'm guessing these should be llvm::DenseMap.

  • [Feature] implement array and method access
  • [Feature] finalize value semantic
JonasToth updated this revision to Diff 142526.Apr 14 2018, 1:56 PM
  • [Feature] start working on handle semantic
JonasToth updated this revision to Diff 142544.Apr 15 2018, 2:51 AM
  • [Misc] unify handle and value modification detection
  • [Misc] found new caveats, maybe take a look at refactoring first an require this check to produce clean compiles on llvm

@aaron.ballman @lebedev.ri The check is getting really complex. I run it over LLVM and detected some warnings, where iam not sure if they are valid or not. Its already a somewhat usable state, but its hard to determine false positives and false negatives.

For me, that false positives are worse for now, because they annoy. Its easier to add false negatives especially with user feedback. My idea is now: Take a look at refactoring the code to introduce const as fixit and iterate the check until LLVM compiles after the check fixed all missing consts. What do you think about that strategy? And could you take a look at the todo list at the top of ConstCheck.cpp. Did I miss something?

JonasToth marked an inline comment as done.Apr 15 2018, 2:56 AM

Coming from https://reviews.llvm.org/D45679, which I should have sent out much earlier to get in front of you :p

Kidding aside this check is more mature than mine and I'm happy to help incorporate mine into this one.

I do have a strong opinion that requires non-trivial refactoring of your diff though: having a standalone reusable helper function like isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:

  • performance/{ForRangeCopyCheck, UnnecessaryCopyInitialization} are using a much less sophisticated isOnlyUsedAsConst(), and can benefit from a more powerful one.
  • I would imagine things could get messier if this check expands to also check for turning member functions const: it's basically checking CxxThisExpr, being a handle, is not modified within a member function, but note there's no VarDecl for "this".
  • It's cleaner to follow (non-const) reference chains and only mark a decl as "can't be const" if everything on the chain is not modified, avoiding false negatives when a variable is bound to a never-modified non-const reference.
  • It's also cleaner to follow member access chains, e.g. "v.a.b.c = 10" modifies "v". This one is particularly tricky because the modifying behavior happens at "c = 10" while the variable we'd like to mark as "can't be const" is arbitrary layers away, and MemberExpr alone doesn't warrant marking "can't be const" because that'll introduce way too many false negatives, ... and I haven't figure out a way to write a matcher to match memberExpr(hasObjectExpression(memberExpr(hasObjectExpression(... <arbitrary layers deep> ... (VarDeclRef) ...), not to mention any layer could either member access or array subscript access.

Coming from https://reviews.llvm.org/D45679, which I should have sent out much earlier to get in front of you :p

Kidding aside this check is more mature than mine and I'm happy to help incorporate mine into this one.

I would love to, yours is more elegant :)

I do have a strong opinion that requires non-trivial refactoring of your diff though: having a standalone reusable helper function like isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:

Yes please!

performance/{ForRangeCopyCheck, UnnecessaryCopyInitialization} are using a much less sophisticated isOnlyUsedAsConst(), and can benefit from a more powerful one.

I did not think about such possibilities, but maybe other analysis task could benefit too? Warnings?

I would imagine things could get messier if this check expands to also check for turning member functions const: it's basically checking CxxThisExpr, being a handle, is not modified within a member function, but note there's no VarDecl for "this".

Using my approach, you can check if any member variable is used as non-const. Then mark this method as const, if it is not virtual.
Similar for member variables: private non-consts can be converted into consts.

It's cleaner to follow (non-const) reference chains and only mark a decl as "can't be const" if everything on the chain is not modified, avoiding false negatives when a variable is bound to a never-modified 
non-const reference.

For warning only yes. But i want automatic code transformation which will break such code if there are non-const references taken. Having it in one pass is certainly nice. With my approach multiple runs are required if the handles are marked as const.

It's also cleaner to follow member access chains, e.g. "v.a.b.c = 10" modifies "v". This one is particularly tricky because the modifying behavior happens at "c = 10" while the variable we'd like to mark as "can't be const" is arbitrary layers away, and MemberExpr alone doesn't warrant marking "can't be const" because that'll introduce way too many false negatives, ... and I haven't figure out a way to write a matcher to match memberExpr(hasObjectExpression(memberExpr(hasObjectExpression(... <arbitrary layers deep> ... (VarDeclRef) ...), not to mention any layer could either member access or array subscript access.

Does your approach work with the nesting? Maybe something recursive or hasDescendent(modifying) could do it?

Is read your comment on this check later, i first saw your new check. Maybe we should discuss only under one of both :) (which one is not important for me :))

jbcoe added a subscriber: jbcoe.Apr 16 2018, 6:04 AM
  • I would imagine things could get messier if this check expands to also check for turning member functions const: it's basically checking CxxThisExpr, being a handle, is not modified within a member function, but note there's no VarDecl for "this".

Probably this should be separate check. See also PR21981.

I would imagine things could get messier if this check expands to also check for turning member functions const: it's basically checking CxxThisExpr, being a handle, is not modified within a member function, but note there's no VarDecl for "this".

Using my approach, you can check if any member variable is used as non-const. Then mark this method as const, if it is not virtual.
Similar for member variables: private non-consts can be converted into consts.

You'll also need to check:

  • a member function calling another non-const member function -> *this can't be const
  • *this is passed as non-const reference param to a function -> *this can't be const

Also when marking decls as "can't be const" we'll have to do record separately for modifying behaviors coming from different functions.
Which of course are all possible but code will get too complex than necessary IMO.

I think member variables are a separate topic: I think we should just treat them as globals and not check on them, the same argument that they can be accessed from multiple translation unit applies to global/namespace scoped variables and class scoped variables. We can only reliably check function/block scope variables.
(reliable meaning being able to achieve zero false positives with useful level of recall, false negatives are inevitable because whenever a modifiable reference/handle escape the current block/translation unit we'll have to assume it's modified.)

It's also cleaner to follow member access chains, e.g. "v.a.b.c = 10" modifies "v". This one is particularly tricky because the modifying behavior happens at "c = 10" while the variable we'd like to mark as "can't be const" is arbitrary layers away, and MemberExpr alone doesn't warrant marking "can't be const" because that'll introduce way too many false negatives, ... and I haven't figure out a way to write a matcher to match memberExpr(hasObjectExpression(memberExpr(hasObjectExpression(... <arbitrary layers deep> ... (VarDeclRef) ...), not to mention any layer could either member access or array subscript access.

Does your approach work with the nesting? Maybe something recursive or hasDescendent(modifying) could do it?

Yes my approach is doing multi-pass matching by calling isModified() recursively. I consider the multi-pass matching "necessary evil" because otherwise we'll have too many false negatives.
I thought about hasDescendent (hasAncestor actually) but I don't think that makes things easier: varDeclRef(hasAncestor(modifying)) would match both "v.a.b.c. = 10" and "map[v] = 10" and we'll still need to double check the modifying behavior does link back to the particular varDeclRef.

Reviewers, what are your thoughts about the approaches?

  • I would imagine things could get messier if this check expands to also check for turning member functions const: it's basically checking CxxThisExpr, being a handle, is not modified within a member function, but note there's no VarDecl for "this".

Probably this should be separate check. See also PR21981.

Sure, a separate check sounds better.
Which makes even strong argument of having a reusable utility checking whether something is modified that can be shared between different checks :)

You'll also need to check:

  • a member function calling another non-const member function -> *this can't be const
  • *this is passed as non-const reference param to a function -> *this can't be const

Yes. That is similar behaviour to checking if a function can be noexcept. But this is all future stuff :)

Also when marking decls as "can't be const" we'll have to do record separately for modifying behaviors coming from different functions.
Which of course are all possible but code will get too complex than necessary IMO.

I cant follow you on that one. I consider every path that allows future modification (like taking a non-const reference/pointer) as cant be const. That would be enough, wouldnt it?
A separate function can only modify a variable if it has some form of non-const handle to it, which must have been created at some point.

I think member variables are a separate topic: I think we should just treat them as globals and not check on them, the same argument that they can be accessed from multiple translation unit applies to global/namespace scoped variables and class scoped variables. We can only reliably check function/block scope variables.
(reliable meaning being able to achieve zero false positives with useful level of recall, false negatives are inevitable because whenever a modifiable reference/handle escape the current block/translation unit we'll have to assume it's modified.)

You are probably right. The only zero-false positive case is: "only const methods called". One could split implementation of a class into several translation units and render the analysis approach useless.

Yes my approach is doing multi-pass matching by calling isModified() recursively. I consider the multi-pass matching "necessary evil" because otherwise we'll have too many false negatives.
I thought about hasDescendent (hasAncestor actually) but I don't think that makes things easier: varDeclRef(hasAncestor(modifying)) would match both "v.a.b.c. = 10" and "map[v] = 10" and we'll still need to double check the modifying behavior does link back to the particular varDeclRef.

Example as reference for later: https://godbolt.org/g/cvhoUN
I will add such cases to my tests.

@shuaiwang Are you in IRC from time to time? I think it will be better to discuss changes in chat, when i start to introduce your approach here.

isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:

What i understand from your comments and code:

  • implement a utility taking DeclRefExpr and check if there is modification in some scope (Stmt, e.g. CompoundStmt) -> isModified or cantBeModifiedWithin
  • match all relevant non-const variable declarations as potential candidates
  • track all uses of the candidates and check for modification in their scope

One Note: We could include variables in unnamed namespaces and static declared variables. They have TU scope.

My deviations:

  • a variable should be considered modified if a non-const handle is create from it, even if that handle does not modify its referenced value. As first step, those handles should be marked const, then the original value can be marked as const. That is required to produce compiling code after potential code-transformation.

I've updated https://reviews.llvm.org/D45679 and I think the isModified() function there is now sufficiently covering the cases you've covered here and can be used as a good starting version if you plan to use it here.
I copied your const-values test cases and it now passes with the following differences:

  • All unused local variables removed, my check will issue a warning for them because technically they're not modified, but I understand why you don't want to cover them. I don't feel strongly about it and I removed it from my copy-pasted test cases because I just to demonstrate isModified()
  • Better recall in some_reference_taking, both np_local0 and r1_np_local0 can be caught, similar for np_local1 and non_const_ref in handle_from_array.
  • direct_class_access np_local5 triggers with my check (shouldn't it?)
  • In range_for non-const loop variables triggers with my check (shouldn't they?)
  • In range_for local arrays doesn't trigger with my check, because I'm treating ArrayToPointerDecay the same as taking address of a variable, and looping over an array would involve ArrayToPointerDecay when the implicit __begin/__end are initialized. I added a note inside isModified for future improvements.
  • My check over triggers for template (with my failed attempt to fix), but I think it's more of some mistake in the check itself rather than in isModified

Also when marking decls as "can't be const" we'll have to do record separately for modifying behaviors coming from different functions.
Which of course are all possible but code will get too complex than necessary IMO.

I cant follow you on that one. I consider every path that allows future modification (like taking a non-const reference/pointer) as cant be const. That would be enough, wouldnt it?
A separate function can only modify a variable if it has some form of non-const handle to it, which must have been created at some point.

Sorry about the confusion.
Basically consider this example:

class Foo {
public:
  void a() { x = 10; }
  void b() { // nothing }
  void c() { a(); }
  void d() { b(); }
  
private:
  int x;
};

If we check whether isModified(dereference(cxxThisExpr())) for each CompondStmt(hasParent(functionDecl())), we would conclude that:

  • a() should stay non-const, because there's this->x = 10 modifying *this
  • b() should be changed to const, because nothing modifies *this
  • c() should stay non-const, because we're invoking non-const member function on this
  • d() should also stay non-const, with the same reason for c(). (We can in theory be smarter about this one because the definition of b() happens to be inside the same TU but that's out of scope right now)

However if we use a per-TU map to record whether x can be const, we'll conclude that x is modified thus can't be const, missing the one that b() is not modifying x. To fix that we'll need two-layered map recording that x is only modified in a() and potentially modified indirectly from c() and d(), so that in the end we can figure out that b() is safe to be marked const.

Anyway, all I was trying to say is: let's use the isModified() approach as it's simpler & cleaner for future use cases like adding const to member functions. And it feels to me that we've already reached agreement there :)

@shuaiwang Are you in IRC from time to time? I think it will be better to discuss changes in chat, when i start to introduce your approach here.

Not really, but I can get myself familiar with it.

isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:

What i understand from your comments and code:

  • implement a utility taking DeclRefExpr and check if there is modification in some scope (Stmt, e.g. CompoundStmt) -> isModified or cantBeModifiedWithin

It doesn't have to be DeclRefExpr, any Expr should work, and it's useful to make it accept any Expr:

  • It can be used to check whether dereference(cxxThisExpr()) is modified or not.
  • For pointer types, it can be used to check both declRefExpr(isPointerType()) and dereference(declRefExpr(isPointerType())), and suggest adding const at different level
  • match all relevant non-const variable declarations as potential candidates
  • track all uses of the candidates and check for modification in their scope

One Note: We could include variables in unnamed namespaces and static declared variables. They have TU scope.

Great catch!

My deviations:

  • a variable should be considered modified if a non-const handle is create from it, even if that handle does not modify its referenced value. As first step, those handles should be marked const, then the original value can be marked as const. That is required to produce compiling code after potential code-transformation.

I believe we can issue warning + fixit within one pass:

int f() {
  int a = 10;
  int& b = a;
  return b;
}

We should be able to issue warnings for both a and b, because b itself is a varDecl without modifying behavior, and when checking a we'll just repeat a bit more work that was already done when checking b by following the reference chain to be able to find that a can be const as well.

I've updated https://reviews.llvm.org/D45679 and I think the isModified() function there is now sufficiently covering the cases you've covered here and can be used as a good starting version if you plan to use it here.
I copied your const-values test cases and it now passes with the following differences:

  • All unused local variables removed, my check will issue a warning for them because technically they're not modified, but I understand why you don't want to cover them. I don't feel strongly about it and I removed it from my copy-pasted test cases because I just to demonstrate isModified()

Nice!

  • Better recall in some_reference_taking, both np_local0 and r1_np_local0 can be caught, similar for np_local1 and non_const_ref in handle_from_array.

Yes. But i feel we need to gather more information. For potential checks, it might be nice to control to which scope modification is defined and/or emit notes for interesting events like taking a non-const handle which isnt modified, too.

It kinda feels like we are converging to a dependency graph for every variable that contains the information for dependencies and dependents + modification history. I dont think it is bad, but maybe complicated :)
What do you think about returning a ModificationRecord that contains the number of direct modification, the number of local non-const handles and escaping non-const handles.
Including const handles would be nice in the future for more and different analysis.

  • direct_class_access np_local5 triggers with my check (shouldn't it?)

Yes can be const: Godbolt

  • In range_for non-const loop variables triggers with my check (shouldn't they?)
  • If the loop copies every value no modification occurs (its not a copy, its initialization. If the constructor would modify its value, the source cant be const either. Is it allowed?)
  • if a handle is taken the usual rules apply.
  • In range_for local arrays doesn't trigger with my check, because I'm treating ArrayToPointerDecay the same as taking address of a variable, and looping over an array would involve ArrayToPointerDecay when the implicit __begin/__end are initialized. I added a note inside isModified for future improvements.

Given this is implemented here, can you use it?

  • My check over triggers for template (with my failed attempt to fix), but I think it's more of some mistake in the check itself rather than in isModified

I removed templates while matching for potential candidates. That can be moved out of the isModified(). isModified() can only reason about instantiated functions. For the actual template we need concepts.

Sorry about the confusion.
Basically consider this example:

class Foo {
public:
  void a() { x = 10; }
  void b() { // nothing }
  void c() { a(); }
  void d() { b(); }
  
private:
  int x;
};

If we check whether isModified(dereference(cxxThisExpr())) for each CompondStmt(hasParent(functionDecl())), we would conclude that:

  • a() should stay non-const, because there's this->x = 10 modifying *this
  • b() should be changed to const, because nothing modifies *this
  • c() should stay non-const, because we're invoking non-const member function on this
  • d() should also stay non-const, with the same reason for c(). (We can in theory be smarter about this one because the definition of b() happens to be inside the same TU but that's out of scope right now)

However if we use a per-TU map to record whether x can be const, we'll conclude that x is modified thus can't be const, missing the one that b() is not modifying x. To fix that we'll need two-layered map recording that x is only modified in a() and potentially modified indirectly from c() and d(), so that in the end we can figure out that b() is safe to be marked const.
Anyway, all I was trying to say is: let's use the isModified() approach as it's simpler & cleaner for future use cases like adding const to member functions. And it feels to me that we've already reached agreement there :)

Yes.

@shuaiwang Are you in IRC from time to time? I think it will be better to discuss changes in chat, when i start to introduce your approach here.

Not really, but I can get myself familiar with it.

We dont need to chat. The review discussion works well. :)

isModified(Expr, Stmt), that checks whether Expr is (potentially) modified within Stmt, is something I believe to be quite useful:

What i understand from your comments and code:

  • implement a utility taking DeclRefExpr and check if there is modification in some scope (Stmt, e.g. CompoundStmt) -> isModified or cantBeModifiedWithin

It doesn't have to be DeclRefExpr, any Expr should work, and it's useful to make it accept any Expr:

  • It can be used to check whether dereference(cxxThisExpr()) is modified or not.
  • For pointer types, it can be used to check both declRefExpr(isPointerType()) and dereference(declRefExpr(isPointerType())), and suggest adding const at different level
  • match all relevant non-const variable declarations as potential candidates
  • track all uses of the candidates and check for modification in their scope

One Note: We could include variables in unnamed namespaces and static declared variables. They have TU scope.

Great catch!

My deviations:

  • a variable should be considered modified if a non-const handle is create from it, even if that handle does not modify its referenced value. As first step, those handles should be marked const, then the original value can be marked as const. That is required to produce compiling code after potential code-transformation.

I believe we can issue warning + fixit within one pass:

int f() {
  int a = 10;
  int& b = a;
  return b;
}

We should be able to issue warnings for both a and b, because b itself is a varDecl without modifying behavior, and when checking a we'll just repeat a bit more work that was already done when checking b by following the reference chain to be able to find that a can be const as well.

We can certainly try it. Speaking from previous experience: There is always some weird way to not catch everything and to get false positives in edge cases, thats why i tried the conservative way first. But i was stumbling around, too. I think the new function can give us more confidence, because its easier to test in isolation.

One interesting thing i found in code: int &some_var = condition ? var1 : var2. I am not sure if we already find that as reference taking :)

JonasToth updated this revision to Diff 142988.Apr 18 2018, 1:17 PM
  • [Feature] refactor check to use a utility function to determine constness

@shuaiwang I implemented the check on top of you utility function. It does fail right now (concentrating on values only for now). I added a FIXME for all false positives/negatives for the values test cases.

JonasToth updated this revision to Diff 142991.Apr 18 2018, 1:34 PM
  • [Misc] mark false positives
  • [Feature] use new statefull const checker
  • [Test] add more tests finding false positives
shuaiwang added inline comments.Apr 29 2018, 11:19 AM
clang-tidy/cppcoreguidelines/ConstCheck.cpp
230–238

I think we need to loop over usages instead of just checking the first, i.e.:

for (const auto &Nodes : Usage) {
  const auto* UseExpr = Nodes.getNodeAs<Expr>("use");
  if (UseExpr && isMutated(UseExpr)) return true;
}

I'll add a helper function in the MutationAnalyzer for checking varDecl directly as well per your comment there, which you'll be able to use directly in this check. Before that's ready (and if you have time of course) could you help check how many false positives are left with this change?

I will migrate to the new API today evening (european time).

Why do you think that looping is required? From my understanding, we
need the first usage (DeclRefExpr) to get the analysis started. The
analysis itself will check all remaining uses. Is this necessary,
because we analysis on a Expr basis?

Am 29.04.2018 um 20:19 schrieb Shuai Wang via Phabricator:

shuaiwang added inline comments.

Comment at: clang-tidy/cppcoreguidelines/ConstCheck.cpp:229-237
+ const auto *UseExpr = selectFirst<Expr>("use", Usage);
+
+ The declared variables was used in non-const conserving way and can not
+
be declared as const.
+ if (UseExpr && Scopes[Scope]->isMutated(UseExpr)) {
+ // diag(UseExpr->getLocStart(), "Investigating starting with this use",

+ // DiagnosticIDs::Note);

I think we need to loop over usages instead of just checking the first, i.e.:

for (const auto &Nodes : Usage) {
  const auto* UseExpr = Nodes.getNodeAs<Expr>("use");
  if (UseExpr && isMutated(UseExpr)) return true;
}

I'll add a helper function in the MutationAnalyzer for checking varDecl directly as well per your comment there, which you'll be able to use directly in this check. Before that's ready (and if you have time of course) could you help check how many false positives are left with this change?

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

  • migrate to Decl interface
  • add template metaprogramming test
  • fix bad template instantiation

Why do you think that looping is required? From my understanding, we
need the first usage (DeclRefExpr) to get the analysis started. The
analysis itself will check all remaining uses. Is this necessary,
because we analysis on a Expr basis?

Yes. the analyzer traces starting from the given DeclRefExpr, e.g.:

int a; // <- varDecl
const int& b = a;  // <- first DeclRefExpr
int& c = a; // <- second DeclRefExpr

const int& b2 = b;
int& c2 = c;
c2 = 10;

If we start from the first DeclRefExpr, we'll only trace to b and then to b2. We need to start from varDecl (or loop over all DeclRefExpr) to be able to trace to the second DeclRefExpr from where we can get to c -> c2 -> c2 = 10.

I see. Thank you for fixing + explaining :)

Am 30.04.2018 um 23:39 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

Why do you think that looping is required? From my understanding, we
need the first usage (DeclRefExpr) to get the analysis started. The
analysis itself will check all remaining uses. Is this necessary,
because we analysis on a Expr basis?

Yes. the analyzer traces starting from the given DeclRefExpr, e.g.:

int a; // <- varDecl
const int& b = a;  // <- first DeclRefExpr
int& c = a; // <- second DeclRefExpr

const int& b2 = b;
int& c2 = c;
c2 = 10;

If we start from the first DeclRefExpr, we'll only trace to b and then to b2. We need to start from varDecl (or loop over all DeclRefExpr) to be able to trace to the second DeclRefExpr from where we can get to c -> c2 -> c2 = 10.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

JonasToth updated this revision to Diff 153267.Jun 28 2018, 1:19 AM
  • rebase on commited ExpMutationAnalyzer
  • clean up, documentation
JonasToth retitled this revision from [clang-tidy] WIP: implement new check for const-correctness to [clang-tidy] implement new check for const-correctness.Jun 28 2018, 1:22 AM
JonasToth updated this revision to Diff 153268.Jun 28 2018, 1:26 AM
JonasToth marked 4 inline comments as done.
  • [Misc] order in release notes
  • fixed some comments
JonasToth updated this revision to Diff 153282.Jun 28 2018, 2:36 AM
  • Merge branch 'master' into check_const

Ping, i guess :)
Anything needed to get this going?

JonasToth added a comment.EditedJul 14 2018, 7:34 AM

Time :/

But review comments are still appreciated :)

Am 14.07.2018 um 16:34 schrieb Roman Lebedev via Phabricator:

lebedev.ri added a comment.

Ping, i guess :)
Anything needed to get this going?

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

Time :/

Is this available as a public Git branch somewhere, or is downloading the diff the preferred way to interact with this code?

Yes.

https://github.com/JonasToth/clang-tools-extra/tree/check_const

This is the branch i work on. I got it up to date with the current
master for CTE. :)

Am 14.07.2018 um 20:59 schrieb Florin Iucha via Phabricator:

0x8000-0000 added a comment.

In https://reviews.llvm.org/D45444#1162715, @JonasToth wrote:

Time :/

Is this available as a public Git branch somewhere, or is downloading the diff the preferred way to interact with this code?

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

@0x8000-0000 are you interested in working on the check? If i recall correctly the branch for 7.0 will happen on 1. august. That would be the timewindow in which it makes sense to give you the patch. I will have time in september following, but not right now.
I try to commit 2 bug fixes in this time, because they are almost done already.

@aaron.ballman What do you think is missing for this check? Do you think it is doable until 1. august?

JonasToth marked an inline comment as done.Jul 17 2018, 1:24 AM

Removed one unresolved comment, probably thats why it was stuck :D

JonasToth updated this revision to Diff 155821.Jul 17 2018, 1:24 AM
  • Merge branch 'master' into check_const
  • [Misc] remove comment and iostream

@0x8000-0000 are you interested in working on the check? If i recall correctly the branch for 7.0 will happen on 1. august. That would be the timewindow in which it makes sense to give you the patch. I will have time in september following, but not right now.
I try to commit 2 bug fixes in this time, because they are almost done already.

@JonasToth
I am interested in helping - is there any area you'd like me to look at? I have reviewed the test cases and they seem fine. The code is a bit more intimidating, given that I just started looking at clang/llvm internals. But I have pulled the patch into my personal build at home and at ${BIG_CO} and I am using the check "in production". I have detected no crashes and no false positives yet.

What do you mean by "The code is a bit more intimidating", the check
itself or the amount of tests?

In general the utility piece that was commited before this check should
already analyze the constness good, given that this check builds upon it
the functionality should be ok.

In my opinion there is nothing major missing, but the reviewers (usually
@aaron.ballman) have to judge that. After that i would come back to you.

If you have a run on a codebase, you could attach logfile to see how
noisy the check is itself. :)

If you do have time, there are always bugs to fix :P

But you could also take a look at the enforcement of some coding
guidelines (my personal todolists:
https://github.com/JonasToth/CppCoreGuidelinesTooling 
https://github.com/JonasToth/HighIntegrityTooling). There are always low
hanging fruit checks. Implementing something there helps getting used to
the codebase of clang, too.

Am 17.07.2018 um 12:41 schrieb Florin Iucha via Phabricator:

@JonasToth
I am interested in helping - is there any area you'd like me to look at? I have reviewed the test cases and they seem fine. The code is a bit more intimidating, given that I just started looking at clang/llvm internals. But I have pulled the patch into my personal build at home and at ${BIG_CO} and I am using the check "in production". I have detected no crashes and no false positives yet.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

But you could also take a look at the enforcement of some coding
guidelines (my personal todolists:
https://github.com/JonasToth/CppCoreGuidelinesTooling 
https://github.com/JonasToth/HighIntegrityTooling). There are always low
hanging fruit checks. Implementing something there helps getting used to
the codebase of clang, too.

You may be also interested to look onto enhancements requests for Clang-tidy in http://bugs.llvm.org.

Any plans for fix-its that will add the suggested 'const' qualifiers?

Yes, but that is complicated. Functionality for that will first be
implemented in the libTooling and then utilized here later.

Am 19.07.2018 um 04:16 schrieb Florin Iucha via Phabricator:

0x8000-0000 added a comment.

Any plans for fix-its that will add the suggested 'const' qualifiers?

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

@aaron.ballman Are you missing something in this check/tests? I think i will remove the big comment with my thoughts on what has to be done, or do you think it would help for reference?

I would probably be able to finalize this week if there is no major issue, so it could be committed for the release.

JonasToth updated this revision to Diff 158285.Jul 31 2018, 8:34 AM
  • Merge branch 'master' into check_const
JonasToth updated this revision to Diff 159062.Aug 3 2018, 11:48 AM

get check up to 8.0

I'm sorry for the delay in reviewing this; I'm not certain how it fell off my radar for so long!

clang-tidy/cppcoreguidelines/ConstCheck.cpp
33

Do you intend to support Obj-C object pointers as well?

142–143

Why is this check disabled for C code?

150

Drop the comma; that -> which

153–154

Can this use unless(anyOf(...)) instead?

165–166

No need for this -- check() won't be called unless there are registered matchers.

168

Can you pick a slightly different name -- Scope is the name of a type in the clang namespace.

191

const -> 'const'

clang-tidy/cppcoreguidelines/ConstCheck.h
26

Grammar nit, perhaps: This check warns on variables which could be declared const but are not.

clang-tidy/cppcoreguidelines/CppCoreGuidelinesTidyModule.cpp
44

This name leaves a bit to be desired. How about cppcoreguidelines-const-correctness?

docs/clang-tidy/checks/cppcoreguidelines-const.rst
7

that -> which

9

coding guidelines for example -> coding guidelines, such as:

13

type based -> type-based

I'm sorry for the delay in reviewing this; I'm not certain how it fell off my radar for so long!

No problem :)

JonasToth added inline comments.Aug 5 2018, 8:21 AM
clang-tidy/cppcoreguidelines/ConstCheck.cpp
33

For now not, because I have no experience nor knowledge with Obj-C.

xbolva00 added inline comments.
clang-tidy/cppcoreguidelines/ConstCheck.cpp
29

typo
paramters -> parameters

JonasToth updated this revision to Diff 159220.Aug 5 2018, 8:48 AM
JonasToth marked 11 inline comments as done.
  • Merge branch 'master' into check_const
  • [Misc] rename and first review comments
  • language stuff
JonasToth added inline comments.Aug 5 2018, 8:48 AM
clang-tidy/cppcoreguidelines/ConstCheck.cpp
142–143

No actual reason. I will allow it for C, too.

JonasToth updated this revision to Diff 159221.Aug 5 2018, 8:51 AM
  • doc list
JonasToth updated this revision to Diff 159224.Aug 5 2018, 9:14 AM
  • revert breaking change in LocalVar matcher.

Reverting the change from allOf(...) to unless(anyOf(..)). I did not investigate
the reason for it breaking, because basic logic suggests that transformation
should be correct.

JonasToth added inline comments.Aug 5 2018, 9:15 AM
clang-tidy/cppcoreguidelines/ConstCorrectnessCheck.cpp
147 ↗(On Diff #159220)

@aaron.ballman The change was not valid for some reason. I leave it like it is if thats ok with you.

JonasToth updated this revision to Diff 159225.Aug 5 2018, 9:19 AM
JonasToth marked an inline comment as done.
  • fix typo
JonasToth marked an inline comment as done.Aug 5 2018, 9:20 AM
JonasToth updated this revision to Diff 159226.Aug 5 2018, 9:27 AM
  • update ReleaseNotes

The functionality is looking good, aside from a few small nits remaining. However, I'm wondering how this should integrate with other const-correctness efforts like readability-non-const-parameter? Also, I'm wondering how this check performs over a large code base like LLVM -- how chatty are the diagnostics, and how bad is the false positive rate (roughly)?

clang-tidy/cppcoreguidelines/ConstCheck.cpp
33

Okay, then please add a comment mentioning that they're explicitly not handled yet (perhaps with a FIXME).

clang-tidy/cppcoreguidelines/ConstCorrectnessCheck.cpp
183 ↗(On Diff #159226)

Still missing the single quotes around const in the diagnostic.

147 ↗(On Diff #159220)

That's.... really odd. I am fine leaving it as-is for this patch, but it would be good to understand why that code fails as it seems like a reasonable exposition.

JonasToth marked 3 inline comments as done.Aug 6 2018, 1:57 PM
JonasToth marked an inline comment as done.Aug 6 2018, 2:01 PM
JonasToth added inline comments.
clang-tidy/cppcoreguidelines/ConstCorrectnessCheck.cpp
147 ↗(On Diff #159220)

Added a TODO. But maybe i did the transformation incorrectly too. Not all conditions are negative. Maybe all the negative ones can be inverted by demorgan. That should be correct.

I will check this out later.

JonasToth marked an inline comment as done.Aug 6 2018, 2:02 PM
JonasToth added inline comments.
clang-tidy/cppcoreguidelines/ConstCorrectnessCheck.cpp
183 ↗(On Diff #159226)

Ups. The comment has them :D

JonasToth updated this revision to Diff 159390.Aug 6 2018, 2:03 PM
JonasToth marked 4 inline comments as done.
  • address review issues, todos/fixmes and diag nit

However, I'm wondering how this should integrate with other const-correctness efforts like readability-non-const-parameter?

I think this check/functionality will kinda replace the readability-non-const-parameter check. The readability check does not a full const-analysis too and i think only works on pointers or sth like this.
Maybe the check name will still exist, but use the ExprMutAnalyzer or it will become an alias to this with special configuration.
I would like to add support for marking methods const plus the ability for code transformation. Currently looking into clang-refactor framework to implement general refactoring primitives necessary for that.
In general its probably better to have one check, that handles all const issues.

Also, I'm wondering how this check performs over a large code base like LLVM -- how chatty are the diagnostics, and how bad is the false positive rate (roughly)?

I will prepare a report for this tomorrow. Currently the LLVM builds take very long on my laptop :(

For reference, here is the output for llvm/lib.

Things i noticed:

  • lambdas are warned as potential const, i think they should be excluded for the values
  • for-loops that initialize two values (usually the start and end-iterator) are correctly diagnosed, but the diagnosis might be misleading. Especially because there is no way to have a const-iterator initialized together with the changing iterator.

for (auto begin = vec.begin(), end = vec.end(); ...)

  • there seems to be a false positive with array-to-pointer decay. ExprMutAnalyzer does think of it, but maybe there is a bug in it.
  • pointer diagnostics don't seem to worker (pointer-as-value). The test-case does work, This must be analyzed further
  • there was a bug in the check method, where AnalyzeValues == 0 did imply no analysis is done.

Given the volume of diagnostics for the value-case i did not investigate many cases manually. I think it makes sense to strive for code-transformation and check if the code still compiles.

References on the other hand seem to function correctly, and most references are single-var-decls, too. That means code transformation is very simple and therefor an easy target to check functionality.
References and values are in general treated the same, so it might give a hint about the performance for values.

  • there seems to be a false positive with array-to-pointer decay. ExprMutAnalyzer does think of it, but maybe there is a bug in it.

Could you give a concrete example of this?

Could you give a concrete example of this?

vi llvm/lib/Demangle/ItaniumDemangle.cpp +1762

/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1762:7: warning:
variable 'num' of type 'char [FloatData<Float>::max_demangled_size]' can
be declared 'const' [cppcoreguidelines-const-correctness]
      char num[FloatData<Float>::max_demangled_size] = {0};
     ^
/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1763:7: warning:
variable 'n' of type 'int' can be declared 'const'
[cppcoreguidelines-const-correctness]
      int n = snprintf(num, sizeof(num), FloatData<Float>::spec, value);
     ^

JonasToth updated this revision to Diff 159775.Aug 8 2018, 1:33 PM
  • fix bug with AnalyzeValues==false skipping whole check, adjust test code to 'const' instead of const

Could you give a concrete example of this?

vi llvm/lib/Demangle/ItaniumDemangle.cpp +1762

/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1762:7: warning:
variable 'num' of type 'char [FloatData<Float>::max_demangled_size]' can
be declared 'const' [cppcoreguidelines-const-correctness]
      char num[FloatData<Float>::max_demangled_size] = {0};
     ^
/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1763:7: warning:
variable 'n' of type 'int' can be declared 'const'
[cppcoreguidelines-const-correctness]
      int n = snprintf(num, sizeof(num), FloatData<Float>::spec, value);
     ^

Looks like related to template.
the exact type of num depends on Float
and the CallExpr is not even resolved because we don't know the type of the arguments yet (adl, overload resolution, etc.)
So the AST doesn't contain any array to pointer decay.

I guess we should suppress the check in such cases.

Always the same with the templates ;) So uninstantiated templates should
just be ignored.

I think it would be better to have it in the ExprMutAnalyzer, because
that part can not decide on const-ness. Fixing it here would just
circumvent the problem but not fix it, would you agree?

Am 10.08.2018 um 22:12 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D45444#1191874, @JonasToth wrote:

lCould you give a concrete example of this?

vi llvm/lib/Demangle/ItaniumDemangle.cpp +1762

/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1762:7: warning:
variable 'num' of type 'char [FloatData<Float>::max_demangled_size]' can
be declared 'const' [cppcoreguidelines-const-correctness]
      char num[FloatData<Float>::max_demangled_size] = {0};
     ^
/home/jonas/opt/llvm/lib/Demangle/ItaniumDemangle.cpp:1763:7: warning:
variable 'n' of type 'int' can be declared 'const'
[cppcoreguidelines-const-correctness]
      int n = snprintf(num, sizeof(num), FloatData<Float>::spec, value);
     ^

Looks like related to template.
the exact type of num depends on Float
and the CallExpr is not even resolved because we don't know the type of the arguments yet (adl, overload resolution, etc.)
So the AST doesn't contain any array to pointer decay.

I guess we should suppress the check in such cases.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

JonasToth updated this revision to Diff 160239.Aug 11 2018, 3:51 AM
  • explicitly ignore lambdas in VarDecls

Always the same with the templates ;) So uninstantiated templates should
just be ignored.

I think it would be better to have it in the ExprMutAnalyzer, because
that part can not decide on const-ness. Fixing it here would just
circumvent the problem but not fix it, would you agree?

Agreed :)
I'll create a diff handling various template related cases in ExprMutationAnalyzer.

Thank you very much :)

Am 12.08.2018 um 00:17 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D45444#1196271, @JonasToth wrote:

Always the same with the templates ;) So uninstantiated templates should
just be ignored.

I think it would be better to have it in the ExprMutAnalyzer, because
that part can not decide on const-ness. Fixing it here would just
circumvent the problem but not fix it, would you agree?

Agreed :)
I'll create a diff handling various template related cases in ExprMutationAnalyzer.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

  • get up to date
JonasToth updated this revision to Diff 164898.Sep 11 2018, 8:48 AM
  • ignore lambdas properly


Here are results for llvm/lib with the current version.

I could not find any outstanding issues right now. I feel that the check needs transformation capabilties. Then i would try to produce invalid const transformations to find false positives. Spoting false negatives is harder though.

JonasToth updated this revision to Diff 165497.Sep 14 2018, 7:02 AM
  • update to ExprMutAnalyzer living in clang now

Yeah, it would be super useful if Clang can add const to all places where possible :) love this work, great!

Soonish it might be able to do so ;)

Am 14.09.2018 um 17:13 schrieb Dávid Bolvanský via Phabricator:

xbolva00 added a comment.

Yeah, it would be super useful if Clang can add const to all places where possible :) love this work, great!

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

  • fix actually use clang-analyses correctly

Are you still looking at this?

I am, but this revision relies on support on the analysis in the
frontend. This seems currently stalled as the developer there seems busy
with other stuff.

Once i implement transofmration-capabilites i will evaluate if there are
any false-positives that would break. Once that is done it might still
land even if there is no frontend progress.

Right now we are aware of false-negatives in the const-ness detection,
that should not influence this check though. And of course obscure
(template) cases might still be buggy.

Am 31.10.18 um 20:33 schrieb Dávid Bolvanský via Phabricator:

xbolva00 added a comment.

Are you still looking at this?

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45444

JonasToth updated this revision to Diff 173653.Nov 12 2018, 4:33 AM
  • Merge branch 'master' into check_const
JonasToth updated this revision to Diff 174942.Nov 21 2018, 9:17 AM
  • Merge branch 'master' into check_const
JonasToth updated this revision to Diff 175445.Nov 27 2018, 3:45 AM
  • Merge branch 'master' into check_const
  • avoid bitrot

Would this warn with stuff like

double borderWidth;
[... code that doesn't use borderWidth ...]
borderWidth = border->getWidth(); 
[... code that reads borderWidth ...]
Herald added a project: Restricted Project. · View Herald TranscriptSep 27 2019, 8:13 AM
Herald added a subscriber: wuzish. · View Herald Transcript

Would this warn with stuff like

double borderWidth;
[... code that doesn't use borderWidth ...]
borderWidth = border->getWidth(); 
[... code that reads borderWidth ...]

Warn in what sense? That borderWidth could be initialized early and then be const?
No. It would only suggest const (if configured to make values const) when borderWidth is initialized and then untouched.

JonasToth updated this revision to Diff 229818.Nov 18 2019, 5:42 AM
  • update license
  • rebase to master and patch for adding const
0x8000-0000 added a comment.EditedDec 2 2019, 3:41 PM

Thank you for rebasing on current master.

I have ran it today on our code base and found three classes of false positives:

  1. Writing to a bitfield of a struct. The struct still is suggested it should be const.
  2. Using a variable with an ostream extraction; like "int foo; cin >> foo;", except it was a template on the stream instance.
  3. In a for loop, with a somewhat strange pair (for auto [foo, bar] = std::pair {}; foo < big_foo; ++ foo).

Let me know if you can't create test cases from these descriptions and I can try to extract the code.

Thank you for rebasing on current master.

I have ran it today on our code base and found three classes of false positives:

  1. Writing to a bitfield of a struct. The struct still is suggested it should be const.
  2. Using a variable with an ostream extraction; like "int foo; cin >> foo;", except it was a template on the stream instance.
  3. In a for loop, with a somewhat strange pair (for auto [foo, bar] = std::pair {}; foo < big_foo; ++ foo).

Let me know if you can't create test cases from these descriptions and I can try to extract the code.

Thank you for testing, I would appreciate if you test later versions, too!
I will focus on landing the utility function for adding const first. I noticed false positives in LLVM as well, I try to reproduce them all and fix/workaround the issues.

Thank you for testing, I would appreciate if you test later versions, too!
I will focus on landing the utility function for adding const first. I noticed false positives in LLVM as well, I try to reproduce them all and fix/workaround the issues.

I will test it as often as you rebase it :) Thank you for the useful tool!

Are you going to keep updating it on https://github.com/JonasToth/llvm-project/commits/feature_check_const ?

Are you going to keep updating it on https://github.com/JonasToth/llvm-project/commits/feature_check_const ?

Yup, thats my backup if something goes wrong and to synchronize my pcs. :)

JonasToth abandoned this revision.Dec 30 2019, 7:45 AM

Abonding this revision in favor of: https://reviews.llvm.org/D54943
It doesn't make sense to have both open at the same time, as the const-transformation happens in its own patch and no heavy lifting is done here.