Page MenuHomePhabricator

[clang-tidy] Add ExprMutationAnalyzer, that analyzes whether an expression is mutated within a statement.
ClosedPublic

Authored by shuaiwang on Apr 15 2018, 6:20 PM.

Details

Summary

(Originally started as a clang-tidy check but there's already D45444 so shifted to just adding ExprMutationAnalyzer)

ExprMutationAnalyzer is a generally useful helper that can be used in different clang-tidy checks for checking whether a given expression is (potentially) mutated within a statement (typically the enclosing compound statement.)

This is a more general and more powerful/accurate version of isOnlyUsedAsConst, which is used in ForRangeCopyCheck, UnnecessaryCopyInitialization.

It should also be possible to construct checks like D45444 (suggest adding const to variable declaration) or https://bugs.llvm.org/show_bug.cgi?id=21981 (suggest adding const to member function) using this helper function.

This function is tested by itself and is intended to stay generally useful instead of tied to any particular check.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I like your refactoring very much and i think we have a state that is close to commit.

My clang-tidy check is already based on top of your functionality, but i can not work on it this weekend and next week is already busy for me. I decided to analysis values and references only for now, and work on pointer semantics later, because of some uncertainty how to handle different things.

Right now, nothing comes to my mind that might be missing. Maybe you could add small helpers and utilities, that take a varDecl and run the analysis (implemented in my check). That would move the last piece into this section :)

clang-tidy/utils/ASTUtils.cpp
125 ↗(On Diff #142882)

Are move and copy constructor covered by this?

149 ↗(On Diff #142882)

I think i was mostly irritated and it is irrelevant with the new semantics (which i think are good for now!).

222 ↗(On Diff #142882)

Ok.

233 ↗(On Diff #142882)

I was wondering about the ternaryOperator and if some suprises might arise from it.

Thats something i will investigate next week within the clang-tidy check.

JonasToth added a comment.EditedApr 20 2018, 3:12 AM

Something came to my mind:

conversion operators. Do they behave as normal overloaded operators? The tests should reflect both a const conversion and a modifying one.

Edit: Already covered

JonasToth added a comment.EditedApr 28 2018, 10:09 AM

There are false positives.

void false_positive() {
// FIXME False positive
int np_local0 = 42;               
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local0' of type 'int' can be declared const
const int &r0_np_local0 = np_local0;
int &r1_np_local0 = np_local0;
r1_np_local0 = 43;
}
// FIXME false positive
int np_local0 = 42;
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local0' of type 'int' can be declared const
const int *const p0_np_local0 = &np_local0;
int *const p1_np_local0 = &np_local0;
*p1_np_local0 = 43;
// FIXME False positive
ConstNonConstClass np_local0;
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local0' of type 'ConstNonConstClass' can be declared const

np_local0.constMethod();
np_local0.nonConstMethod();
// FIXME array access was modifying
ConstNonConstClass np_local0[2];
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local0' of type 'ConstNonConstClass [2]' can be declared const
np_local0[0].constMethod();
np_local0[1].constMethod();
np_local0[1].nonConstMethod();
// FIXME false positive
int np_local5[3] = {1, 2, 3};
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local5' of type 'int [3]' can be declared const [cppcoreguidelines-const]
int &np_local6 = np_local5[1] < np_local5[2] ? np_local5[0] : np_local5[2];
np_local6 = 42;
// FIXME false positive
int np_local7[3] = {1, 2, 3};
// CHECK-MESSAGES: [[@LINE-1]]:3: warning: variable 'np_local7' of type 'int [3]' can be declared const [cppcoreguidelines-const]
int *np_local8 = np_local7[1] < np_local7[2] ? &np_local7[0] : &np_local7[2];
*np_local8 = 42;
shuaiwang marked 11 inline comments as done.

Added an isMutated overload for Decl.
A few more test cases.

I migrated to the new Decl interface and adjusted my tests, there are no false positives left and I am not aware of more possible code constellation that would require testing.
If you have time you could check my tests, maybe you find something I missed.

Otherwise this is LGTM from me, but the reviewers must of course accept it (@aaron.ballman, @alexfh)

aaron.ballman added inline comments.May 2 2018, 11:18 AM
clang-tidy/utils/ExprMutationAnalyzer.cpp
43

Can elide the braces.

72

Don't use auto here.

83–84

Why does this not return the result of findMutation() like the other call above?

89

Should this also consider a DeclRefExpr to a volatile-qualified variable as a direct mutation?

What about using Expr::HasSideEffect()?

shuaiwang updated this revision to Diff 145121.May 3 2018, 5:24 PM
shuaiwang marked 4 inline comments as done.

Addressed review comments, notably added check for DeclRefExpr to volatile variables.

clang-tidy/utils/ExprMutationAnalyzer.cpp
83–84

find*Mutation is intended to return where the given Expr is mutated.
To make it a bit more useful, for Decl's I'm returning the DeclRefExpr and caller can chose to follow it to find where the DeclRefExpr is mutated if needed.
As a concrete example:

struct A { int x; };
struct B { A a; };
struct C { B b; };
C c;
C& c2 = c;
c2.b.a.x = 10;

If we start with DeclRefExpr to c we'll find mutation at a DeclRefExpr to c2, following that we can find the mutation Stmt being c2.b.a.x = 10.
Returning E here makes it possible to reveal intermediate checkpoint instead of directly saying c is mutated at c2.b.a.x = 10 which might be confusing.

89

Good catch about DeclRefExpr to volatile.

HasSideEffects means something different. Here find*Mutation means find a Stmt in ancestors that mutates the given Expr. HasSideEffects IIUC means whether anything is mutated by the Expr but doesn't care about what exactly is mutated.

shuaiwang retitled this revision from [clang-tidy] Add a helper function isModified, that checks whether an expression is modified within a statement. to [clang-tidy] Add ExprMutationAnalyzer, that analyzes whether an expression is mutated within a statement..May 3 2018, 5:26 PM
shuaiwang edited the summary of this revision. (Show Details)
JonasToth added inline comments.May 4 2018, 1:23 AM
clang-tidy/utils/ExprMutationAnalyzer.cpp
89

May I ask the exact semantics for volatile? I would add the test to my check, too.

It is possible to write const volatile int m = 42;, but if i understand correctly m could still change by hardware, or would this be UB?
If the const is missing, one must always assume external modification, right?

aaron.ballman added inline comments.May 4 2018, 5:59 AM
clang-tidy/utils/ExprMutationAnalyzer.cpp
89

HasSideEffects means something different. Here find*Mutation means find a Stmt in ancestors that mutates the given Expr. HasSideEffects IIUC means whether anything is mutated by the Expr but doesn't care about what exactly is mutated.

What I am getting at is that the code in findDirectMutation() seems to go through a lot of effort looking for expressions that have side effects and I was wondering if we could leverage that call instead of duplicating the effort.

89

May I ask the exact semantics for volatile? I would add the test to my check, too.

Basically, volatile objects can change in ways unknown to the implementation, so a volatile read must always perform an actual read (because the object's value may have changed since the last read) and a volatile write must always be performed as well (because writing a value may have further side effects if the object is backed by some piece of interesting hardware).

It is possible to write const volatile int m = 42;, but if i understand correctly m could still change by hardware, or would this be UB?

That could still be changed by hardware, and it would not be UB. For instance, the variable might be referencing some hardware sensor and so you can only read the values in (hence it's const) but the values may be changed out from under you (hence, it's not constant).

If the const is missing, one must always assume external modification, right?

You have to assume external modification regardless.

shuaiwang updated this revision to Diff 145313.May 4 2018, 3:36 PM

Revert "DeclRefExpr to volatile handling" part of last update.

shuaiwang marked 5 inline comments as done.May 4 2018, 3:49 PM

I've reverted the handling of DeclRefExpr to volatile after rethinking about it.

The purpose of this analyzer is to find whether the given Expr is mutated within a scope, but doesn't really care about external changes. In other words we're trying to find mutations explicitly written within the specified scope.

clang-tidy/utils/ExprMutationAnalyzer.cpp
89

Unfortunately I don't think HasSideEffects help in here. We're finding one particular side effect that is mutating the specified Expr so we need to have explicitly equalsNode(Exp) for all the matches. HasSideEffects can simply claim f(a, b) or even g() has side effects (given void f(Foo&, const Bar&), void g()) while we need to require a link to the specific Expr we're analyzing (i.e. f(a, b) is mutating a but not b)

I've reverted the handling of DeclRefExpr to volatile after rethinking about it.

The purpose of this analyzer is to find whether the given Expr is mutated within a scope, but doesn't really care about external changes. In other words we're trying to find mutations explicitly written within the specified scope.

Okay, that seems like a good reason to ignore volatile qualifiers. Thank you for the explanation!

clang-tidy/utils/ExprMutationAnalyzer.cpp
89

Ah, okay, thank you for the explanation.

I think you may want to look at the implementation for HasSideEffect() to identify other cases you may want to support. For instance, adding tests for statement expressions, paren expressions, bizarre things like VLA mutations in otherwise unevaluated contexts, etc.

shuaiwang updated this revision to Diff 146531.May 13 2018, 3:08 PM
shuaiwang marked 3 inline comments as done.

Handle unevaluated expressions.

aaron.ballman added inline comments.May 18 2018, 5:41 AM
clang-tidy/utils/ExprMutationAnalyzer.cpp
22–26

This can be replaced with return llvm::find(Node.capture_inits(), E);

35–39

I think these should be exposed as public AST matchers, as they're both generally useful. It can be done in a follow-up patch, or done before landing this patch, either is fine by me.

68

What about other unevaluated expressions, like typeof, _Generic, etc? You should search for ExpressionEvaluationContext::Unevaluated in Clang to see where we set up an unevaluated expression evaluation context to find all of the situations.

73

What is this trying to match with the typeLoc() matcher?

75–76

I think these are only approximations to testing whether it's unevaluated. For instance, won't this match these expressions, despite them being evaluated?

// C++
#include <typeinfo>

struct B {
  virtual ~B() = default;
};

struct D : B {};

B& get() {
  static B *b = new D;
  return *b;
}

void f() {
  (void)typeid(get()); // Evaluated, calls get()
}

/* C99 and later */
#include <stddef.h>

void f(size_t n) {
  (void)sizeof(int[++n]); // Evaluated, increments n
}
george.karpenkov resigned from this revision.May 30 2018, 4:30 PM
shuaiwang updated this revision to Diff 149659.Jun 3 2018, 4:15 PM
shuaiwang marked 5 inline comments as done.

Handle sizeof on VLA.
Added test case for typeof()
Added TODO for handling typeid and generic selection.

clang-tidy/utils/ExprMutationAnalyzer.cpp
35–39

Will leave as follow up patch.

68

I checked around and I think these are all we care about:

  • decltype/typeof
  • sizeof/alignof
  • noexcept
  • _Generic
  • typeid

I've added TODO for _Generic and typeid for now because I think those need a bit more work to implement, while at the same time not supporting them for now only creates false positives from isMutated which is what we prefer over false negatives.

73

Added comment to explain.

75–76

Fixed handling of sizeof(VLA)
Added TODO for typeid

shuaiwang updated this revision to Diff 149661.Jun 3 2018, 5:05 PM
shuaiwang marked 4 inline comments as done.

Handle typeid and generic selection.

Overestimated the work of supporting typeid and _Generic, done now.

aaron.ballman accepted this revision.Jun 5 2018, 12:42 AM

Aside from a minor commenting nit, this LGTM.

clang-tidy/utils/ExprMutationAnalyzer.cpp
79–82

I think this comment is stale now.

This revision is now accepted and ready to land.Jun 5 2018, 12:42 AM
shuaiwang updated this revision to Diff 150026.Jun 5 2018, 12:24 PM
shuaiwang marked an inline comment as done.

Remove stale comment

Thanks a lot for the review!

Could you also help commit this diff as well? Thanks!

@shuaiwang I can commit it in 2 hours for you if you want, after my lecture.

Hmm. SVN does not want  me to commit anything :/

I will retry later today, but maybe i cant commit.

Best, Jonas

Am 06.06.2018 um 10:47 schrieb Aaron Ballman:

Thanks a lot for the review!

Could you also help commit this diff as well? Thanks!

There are at least two previous accepted&committed differentials (D17586, D45702),
i think you can ask for commit bit now.

aaron.ballman closed this revision.Jun 13 2018, 7:46 AM

I've commit in r334604 with one minor change -- I added an include for <cctype> to the unit test so that std::isspace() would compile properly on all platforms.

aaron.ballman reopened this revision.Jun 13 2018, 8:07 AM

I had to revert due to failing tests. The revert was done in r334606 and this is an example of a failing bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/31500

This revision is now accepted and ready to land.Jun 13 2018, 8:07 AM
shuaiwang updated this revision to Diff 151244.Jun 13 2018, 2:02 PM

Don't include <typeinfo> in unit tests, should fix the test failures.

shuaiwang updated this revision to Diff 151245.Jun 13 2018, 2:05 PM

Add include <cctype> for std::isspace() (thanks aaron.ballman!)

I had to revert due to failing tests. The revert was done in r334606 and this is an example of a failing bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/31500

Apparently I can't include <typeinfo> in unit test cases (somehow that works on my machine :p)
Changed to just forward declare std::type_info instead.

It might be the case, that the test is run with -no-stdinc (or similar),
so the standard library is not available.

Am 13.06.2018 um 23:08 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D45679#1131115, @aaron.ballman wrote:

I had to revert due to failing tests. The revert was done in r334606 and this is an example of a failing bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/31500

Apparently I can't include <typeinfo> in unit test cases (somehow that works on my machine :p)
Changed to just forward declare std::type_info instead.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45679

It might be the case, that the test is run with -no-stdinc (or similar),
so the standard library is not available.

Tests should be self-contained and must not depend on any system headers. Standard library implementations may differ, which would make tests brittle. Instead, tests should mock implementations of the APIs they work with (if necessary, multiple times - once for every distinct standard library version the check supports).

Could someone help commit this now that the build issue with header include is fixed? Thanks a lot!
(meanwhile I'm requesting commit access)

It might be the case, that the test is run with -no-stdinc (or similar),
so the standard library is not available.

Tests should be self-contained and must not depend on any system headers. Standard library implementations may differ, which would make tests brittle. Instead, tests should mock implementations of the APIs they work with (if necessary, multiple times - once for every distinct standard library version the check supports).

Yep. It doesn't include any headers anymore :)

This revision was automatically updated to reflect the committed changes.
alexfh added inline comments.Jun 28 2018, 5:14 AM
unittests/clang-tidy/ExprMutationAnalyzerTest.cpp
582

FYI, this test had to be fixed for a ps4 buildbot. See r335799 for details.

@aaron.ballman @alexfh @shuaiwang Would it be possible to move that code into a matcher, or into a something which could be used from Clang? We would also like to use similar functionality, but can't include stuff from clang-tidy.

@aaron.ballman @alexfh @shuaiwang Would it be possible to move that code into a matcher, or into a something which could be used from Clang? We would also like to use similar functionality, but can't include stuff from clang-tidy.

I think it should in theory be possible, though I need some advice about where exactly is the best place in clang.

I think clang/lib/Analysis is a good place.

I think it should in theory be possible, though I need some advice about where exactly is the best place in clang.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45679

lebedev.ri added inline comments.Sep 10 2018, 5:13 AM
clang-tidy/utils/ExprMutationAnalyzer.h
38

@shuaiwang, @JonasToth hi.
Is it very intentional that this findDeclMutation() is private?

Is there some other way i should be doing if i have a statement S,
a declRefExpr D, and i want to find the statement Q, which mutates
the underlying variable to which the declRefExpr D refers?

lebedev.ri added inline comments.Sep 10 2018, 5:17 AM
clang-tidy/utils/ExprMutationAnalyzer.h
38

(the statement Q within the statement S, of course)

lebedev.ri added inline comments.Sep 10 2018, 7:14 AM
clang-tidy/utils/ExprMutationAnalyzer.h
38

@shuaiwang after a disscussion about this in IRC with @JonasToth, i have filed https://bugs.llvm.org/show_bug.cgi?id=38888
But i'm failing to CC you there. Are you not registered in the bugzilla?

@aaron.ballman @alexfh @shuaiwang Would it be possible to move that code into a matcher, or into a something which could be used from Clang? We would also like to use similar functionality, but can't include stuff from clang-tidy.

FYI I haven't forgot about this, I'll look into doing it after a few pending changes are in.

clang-tidy/utils/ExprMutationAnalyzer.h
38

There's no real reason findDeclMutation is private other than that there wasn't a use case :)
Feel free to make it public if you find it useful that way.

I'll take a look at the bug, thanks for reporting it!
(I don't have an account there yet, I'm requesting one right now, will follow up in the bug)

@shuaiwang What are you working on next? Do you have an idea on how we
will handle the pointee of a pointer? That would be very interesting for
my const-correctness check. if you need help with anything, just say so.
I would like to support

Am 10.09.2018 um 18:49 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D45679#1183116, @george.karpenkov wrote:

@aaron.ballman @alexfh @shuaiwang Would it be possible to move that code into a matcher, or into a something which could be used from Clang? We would also like to use similar functionality, but can't include stuff from clang-tidy.

FYI I haven't forgot about this, I'll look into doing it after a few pending changes are in.

================
Comment at: clang-tidy/utils/ExprMutationAnalyzer.h:38
+ const Stmt *findDeclMutation(ArrayRef<ast_matchers::BoundNodes> Matches);
+ const Stmt *findDeclMutation(const Decl *Dec);
+


lebedev.ri wrote:

lebedev.ri wrote:

lebedev.ri wrote:

@shuaiwang, @JonasToth hi.
Is it very intentional that this findDeclMutation() is private?

Is there some other way i should be doing if i have a statement S,
a declRefExpr D, and i want to find the statement Q, which mutates
the underlying variable to which the declRefExpr D refers?

(the statement Q within the statement S, of course)

@shuaiwang after a disscussion about this in IRC with @JonasToth, i have filed https://bugs.llvm.org/show_bug.cgi?id=38888
But i'm failing to CC you there. Are you not registered in the bugzilla?

There's no real reason findDeclMutation is private other than that there wasn't a use case :)
Feel free to make it public if you find it useful that way.

I'll take a look at the bug, thanks for reporting it!
(I don't have an account there yet, I'm requesting one right now, will follow up in the bug)

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45679

I have a rough idea about how findPointeeMutation would look like, I'm pretty sure I can use a lot of your help :)
My current plan:

  • Finish the few existing pending changes
  • Move the analyzer to clang/lib/Analysis (likely remove PseudoConstantAnalysis there as well since it appears to be dead code for years)
  • Implement findPointeeMutation

@shuaiwang What are you working on next? Do you have an idea on how we
will handle the pointee of a pointer? That would be very interesting for
my const-correctness check. if you need help with anything, just say so.
I would like to support

Am 10.09.2018 um 18:49 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D45679#1183116, @george.karpenkov wrote:

@aaron.ballman @alexfh @shuaiwang Would it be possible to move that code into a matcher, or into a something which could be used from Clang? We would also like to use similar functionality, but can't include stuff from clang-tidy.

FYI I haven't forgot about this, I'll look into doing it after a few pending changes are in.

================
Comment at: clang-tidy/utils/ExprMutationAnalyzer.h:38
+ const Stmt *findDeclMutation(ArrayRef<ast_matchers::BoundNodes> Matches);
+ const Stmt *findDeclMutation(const Decl *Dec);
+


lebedev.ri wrote:

lebedev.ri wrote:

lebedev.ri wrote:

@shuaiwang, @JonasToth hi.
Is it very intentional that this findDeclMutation() is private?

Is there some other way i should be doing if i have a statement S,
a declRefExpr D, and i want to find the statement Q, which mutates
the underlying variable to which the declRefExpr D refers?

(the statement Q within the statement S, of course)

@shuaiwang after a disscussion about this in IRC with @JonasToth, i have filed https://bugs.llvm.org/show_bug.cgi?id=38888
But i'm failing to CC you there. Are you not registered in the bugzilla?

There's no real reason findDeclMutation is private other than that there wasn't a use case :)
Feel free to make it public if you find it useful that way.

I'll take a look at the bug, thanks for reporting it!
(I don't have an account there yet, I'm requesting one right now, will follow up in the bug)

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D45679

I have a rough idea about how findPointeeMutation would look like, I'm pretty sure I can use a lot of your help :)
My current plan:

  • Finish the few existing pending changes
  • Move the analyzer to clang/lib/Analysis (likely remove PseudoConstantAnalysis there as well since it appears to be dead code for years)
  • Implement findPointeeMutation

Sounds like a solid plan. I think the clang-tidy/utils/DeclRefExprUtils.{h,cpp} are not necessary anymore, too. They are used in 2 clang-tidy checks. Maybe i can remove them as well :)