This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Handle unique owning smart pointers in ExprMutationAnalyzer
ClosedPublic

Authored by shuaiwang on Aug 16 2018, 9:41 PM.

Details

Summary

For smart pointers like std::unique_ptr which uniquely owns the
underlying object, treat the mutation of the pointee as mutation of the
smart pointer itself.

This gives better behavior for cases like this:

void f(std::vector<std::unique_ptr<Foo>> v) { // undesirable analyze result of `v` as not mutated.
  for (auto& p : v) {
      p->mutate(); // only const member function `operator->` is invoked on `p`
  }
}

Event Timeline

shuaiwang created this revision.Aug 16 2018, 9:41 PM

I am suprised that this does not automatically follow from the general rules. At the end, smartpointers cant do anything else then 'normal' classes.

The operator+/-> were not handled before? The mutation of SmartPtr x; x->mf(); should already be catched, not?

I am suprised that this does not automatically follow from the general rules. At the end, smartpointers cant do anything else then 'normal' classes.

The operator+/-> were not handled before? The mutation of SmartPtr x; x->mf(); should already be catched, not?

Different from std::vector::operator[] which has two overloads for const and non-const access, std::unique_ptr only has one const version of operator->.
So for SmartPtr x; x->mf(); we only see a const operator being invoked on x. mf is not a member of SmartPtr and the member call to mf is not on x directly, we never followed that far.

I see, thank you :)

Am 17.08.2018 um 18:55 schrieb Shuai Wang via Phabricator:

shuaiwang added a comment.

In https://reviews.llvm.org/D50883#1203690, @JonasToth wrote:

I am suprised that this does not automatically follow from the general rules. At the end, smartpointers cant do anything else then 'normal' classes.

The operator+/-> were not handled before? The mutation of SmartPtr x; x->mf(); should already be catched, not?

Different from std::vector::operator[] which has two overloads for const and non-const access, std::unique_ptr only has one const version of operator->.
So for SmartPtr x; x->mf(); we only see a const operator being invoked on x. mf is not a member of SmartPtr and the member call to mf is not on x directly, we never followed that far.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D50883

Different from std::vector::operator[] which has two overloads for const and non-const access, std::unique_ptr only has one const version of operator->.
So for SmartPtr x; x->mf(); we only see a const operator being invoked on x. mf is not a member of SmartPtr and the member call to mf is not on x directly, we never followed that far.

I think the operator-> is transitively called, isn't it? (see andrei alexandrescu talk here: https://youtu.be/ozOgzlxIsdg?t=15m55s where he gives a nice class for generic locking)
Maybe there is something more general at work? I think @aaron.ballman knows more about deeper language questions :)

Different from std::vector::operator[] which has two overloads for const and non-const access, std::unique_ptr only has one const version of operator->.
So for SmartPtr x; x->mf(); we only see a const operator being invoked on x. mf is not a member of SmartPtr and the member call to mf is not on x directly, we never followed that far.

I think the operator-> is transitively called, isn't it? (see andrei alexandrescu talk here: https://youtu.be/ozOgzlxIsdg?t=15m55s where he gives a nice class for generic locking)
Maybe there is something more general at work? I think @aaron.ballman knows more about deeper language questions :)

Right, for x->mf() (where x is some kind of smart pointer), operator-> is invoked on the smart pointer itself, and then that yields a (real) pointer, in which case the operator-> magically reappears and is applied on the real pointer.
For now we're basically treating SmartPtr::operator->() the same as "taking the address of Exp", which is treated as immediately mutating Exp. The benefit of this approach is: when we're able to do findPointeeMutation, we'll be able to further follow the pointer and see whether the pointee is mutated, and we'll be able to distinguish between these two cases:

struct Foo {
void mf();
void cmf() const;
};
std::unique_ptr<Foo> p;
p->mf(); // mutates Foo
p->cmf(); // doesn't mutate Foo, but is treated as mutation for now. Can be improved when we're able to do findPointeeMutation( <pointer returned by operator->() call on p> )

Ofc the current limitation with assuming always modification stays with my proposed tests. But these are the tests we need once implemented full analysis of pointers.

unittests/clang-tidy/ExprMutationAnalyzerTest.cpp
776

Template testcases i miss:

// modifying
template <typename T>
void f() { UnqiuePtr<T> x; x->mf(); }

// constant
template <typename T>
void f2() { UnqiuePtr<T> x; x->cmf(); }

// indecidable for the template itself, but only the instantiations
template <typename T>
void f3() { T x; x->cmf(); }

struct const_class { void cmf() const; }
struct modifying_class { void cmf(); };

void call_template() {
// don't trigger
f3<UniquePtr<const_class>>();
// trigger modification
f3<random_class*>();
}

// again not decidable by the template itself
template <typename T>
void f4() { T t; *t; }

struct very_weird {
    int& operator*() { return *new int(42); }
};
void call_template_deref() {
  // no modification
  f4<int*>();
  // modification, because deref is not const
  f4<UniquePtr<very_weird>>():
}
shuaiwang marked an inline comment as done.

rebase & add test case

unittests/clang-tidy/ExprMutationAnalyzerTest.cpp
776

Added a case with template. However I don't think we can do much whenever template appears: only the AST of uninstantiated template worth analyzing, because whatever analyze result (diag or fixit) would have to be applied to the template itself not the instantiations.
So the fact that the template argument of f3 or f4 could happen to be UniquePtr doesn't really matter when we analyze f3 and f4, we have to analyze the way assuming the "worst" anyway.

JonasToth added inline comments.Sep 11 2018, 2:26 AM
unittests/clang-tidy/ExprMutationAnalyzerTest.cpp
776

Yes, whenever there is something type-dependent we have to bail, thats why i would like to see the testcases to show that we actually bail, even when one instantiation would not modify.

I believe we do analyze all versions of the templated functions (uninstantiated and every instantiation), don't we? With that there can be conflicting results.

shuaiwang added inline comments.Sep 11 2018, 9:14 AM
unittests/clang-tidy/ExprMutationAnalyzerTest.cpp
776

I see, it's the conflicting results you're going after :)
Good news is that we actually don't analyze all versions, we only analyze the version (instantiated or not) corresponding to the "scope" stmt passed into the constructor. Semantic-wise I feel this makes sense because if we're given an instantiated version we shouldn't bail out because nothing is type-dependent anymore in the instantiated version.
Also I think conflicts won't happen much in practice, most (all?) checks naturally pass in the uninstantiated version, in order to pass in an instantiated version a check needs to:

  • Find an instantiation point
  • Match and extract the function decl from the callExpr
  • Extract function body compontStmt from the function decl

at that point the check owner likely knows pretty well what they're doing and shouldn't be surprised that analyze results conflicts if they happen to also analyze an uninstantiated version.

Your point is valid, that the decision of what to analyze should be done
outside. My const-correctness check does analyze all versions of the
templated function, because it just matches on
functionDecl(compoundStmt()).

Maybe we just need some experience with real world code. The
const-correctness thing is close to finish in its first version. Then we
can exercise its results.

Am 11.09.2018 um 18:15 schrieb Shuai Wang via Phabricator:

I see, it's the conflicting results you're going after :)
Good news is that we actually don't analyze all versions, we only analyze the version (instantiated or not) corresponding to the "scope" stmt passed into the constructor. Semantic-wise I feel this makes sense because if we're given an instantiated version we shouldn't bail out because nothing is type-dependent anymore in the instantiated version.
Also I think conflicts won't happen much in practice, most (all?) checks naturally pass in the uninstantiated version, in order to pass in an instantiated version a check needs to:

  • Find an instantiation point
  • Match and extract the function decl from the callExpr
  • Extract function body compontStmt from the function decl

at that point the check owner likely knows pretty well what they're doing and shouldn't be surprised that analyze results conflicts if they happen to also analyze an uninstantiated version.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D50883

Yeah let's see what happens in the wild and decide whether we need further actions. In any case I think that deserves a separate diff.
Is there other concerns about this diff?

Your point is valid, that the decision of what to analyze should be done
outside. My const-correctness check does analyze all versions of the
templated function, because it just matches on
functionDecl(compoundStmt()).

Maybe we just need some experience with real world code. The
const-correctness thing is close to finish in its first version. Then we
can exercise its results.

Am 11.09.2018 um 18:15 schrieb Shuai Wang via Phabricator:

I see, it's the conflicting results you're going after :)
Good news is that we actually don't analyze all versions, we only analyze the version (instantiated or not) corresponding to the "scope" stmt passed into the constructor. Semantic-wise I feel this makes sense because if we're given an instantiated version we shouldn't bail out because nothing is type-dependent anymore in the instantiated version.
Also I think conflicts won't happen much in practice, most (all?) checks naturally pass in the uninstantiated version, in order to pass in an instantiated version a check needs to:

  • Find an instantiation point
  • Match and extract the function decl from the callExpr
  • Extract function body compontStmt from the function decl

at that point the check owner likely knows pretty well what they're doing and shouldn't be surprised that analyze results conflicts if they happen to also analyze an uninstantiated version.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D50883

This revision is now accepted and ready to land.Sep 11 2018, 10:19 AM
This revision was automatically updated to reflect the committed changes.