This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] MoveChecker Pt.2: Restrict the warning to STL objects and locals.
ClosedPublic

Authored by NoQ on Nov 14 2018, 5:09 PM.

Details

Summary

This patch implements the post important part of the plan proposed in D54556. Under alpha.cplusplus.Move:Aggressive=true, the checker retains the original behavior, but when that option is equal to false (which is the proposed default behavior), the checker only warns in the following three cases:

  1. If the object that's being used after move is of a known STL type (the word "known" here makes no sense; what i'm trying to say is that we'll rant about objects of any STL type, unless we encounter false positives even there and it'll cause us to restrict the check even further). The point here is that according to the C++ Standard, STL objects are known to remain in "valid but unspecified state", which means that the state is going to be internally consistent state which is not known. A lot of operations clearly make no sense (eg., asking for the value of the first character in a string that has just been moved makes little sense), and we know it and we can warn in these cases reliably. If some operations do make sense or, moreover, if some operations return the object into a valid state, we can easily hardcode these operations in the checker because STL has limited amount of stuff in it.
  2. If the object is a local or parameter variable. This not only correlates with the true positives that i've seen so far, but also kinda makes sense, though, as usual, i'd love to be proven wrong. When an object is, say, a field in a class, or, even worse, a global, then even if you move from that object, the object is still there. It won't go anywhere; it *will* have to be re-used eventually. Which means that if the object is left in a well-specified state after getting moved, there's no good reason to enforce the developer to reset it manually. However, if the object is a local that is going to be cleaned up soon anyway, it is very unlikely that the desire to re-use its storage is well-justified. It should be totally fine to create another local and use that as a storage for the next object that you need. Hence the idea.
  3. If the object is an rvalue reference. The idea is exactly the same as in 2: an xvalue is something that's near the end of its lifetime, so there's no temptation to re-use the storage. Whoever passed us that value is already fine with it being moved away, we have no obligation to fill in the space again.

Diff Detail

Repository
rC Clang

Event Timeline

NoQ created this revision.Nov 14 2018, 5:09 PM
NoQ added a comment.Nov 14 2018, 6:54 PM

So, like, with locals it should be possible to easily and intuitively suppress the false positive, even if there is one, by simply using an extra variable, and additionally there's the [[clang::reinitializes]] attribute added by @xazax.hun that will help people suppress the warning when their object has a popular state-reset method that Static Analyzer doesn't understand. I believe this is a good trade-off between usefulness and usability, given that most positives with locals i've seen so far looked like real bugs.

xazax.hun accepted this revision.Nov 15 2018, 4:50 AM

It would be great to have a way to extend the list of (possibly non-stl) types to check. But I do understand that the analyzer does not have a great way to set such configuration options right now.

This revision is now accepted and ready to land.Nov 15 2018, 4:50 AM

I think we should either remove the non-default functionality (which wouldn't be ideal), or emphasise somewhere (open projects?) that there is still work to be done, but leaving it to be forgotten and essentially making it an extra maintenance work would be, in my optinion, the worst case scenario. Aggressive isn't Pedantic because it actually emits warnings on correct code, and it's not a simple matter of too many reports being emitted, let's also document that this is an experimental feature, not a power-user-only thing.

Whenever I compile Rtags, I get a bunch of move related warnings, I'm curious how this patch behaves on that project. I'll take a look.

lib/StaticAnalyzer/Checkers/MoveChecker.cpp
93–96

How about a simple public data member? Since all callbacks are const, it wouldn't be modified after registration anyways.

135

You use dyn_cast_or_null, which to me implies that the returned value may be nullptr, but you never check it on the call sites.

274–283

I've seen checker option descriptions in registry functions, in-class where they are declared, and on top of the file, I think any of them would be more ideal. Since my checker option related efforts are still far in the distance (which will put all of them in Checkers.td), let's move (or at least copy) this somewhere else.

test/Analysis/use-after-move.cpp
47–52

Any particular reason for not using cxx-system-header-simulator.h?

Alpha checkers, at least to developers, are clearly stating "Please finish me!", but a non-alpha, enabled-by-default checker with a flag that you'd never know about unless you looked at the source code (at least up until I fix these) sounds like it'll be forgotten like many other similar flags.

NoQ added a comment.Nov 15 2018, 4:45 PM

Whenever I compile Rtags, I get a bunch of move related warnings, I'm curious how this patch behaves on that project. I'll take a look.

Whoops, sry, i accidentally had a look because i misread your message as if you wanted me to take a look >.<

Spoiler alert!

It finds one positive (and one duplicate of that one positive):

I believe this positive is a real bug, but we can still do better here. We find it as a use-after-move of a local variable, which is not a bug on its own, i.e. the user intended to empty and then re-use the storage and it's fine, this is not a usual kind of typo where the user uses the pre-move variable instead of the post-move variable. But the real bug here is that this local variable uses an std::string as a field under the hood, which is, well, not guaranteed to be empty after move, like all other STL objects: https://stackoverflow.com/questions/27376623/can-you-reuse-a-moved-stdstring

NOTE: As also mentioned in this stackoverflow thread, we also need to exclude smart pointers from the STL check because they don't end up in unspecified state, and see if there are other cornercases here.

Unfortunately, we don't find this bug as an STL use-after-move bug because inlining isn't happening. Why? I guess i'll leave it as an excercise ^.^ It's a combination of running out of budget and a specific feature that we have. By flipping a single -analyzer-config flag that represents that feature, we are able to find such bugs as STL bugs (when local variable bugs are disabled in the checker). We're still not able to find the original bug, most likely due to running out of budget (i didn't debug this further), but we can find it in a minimal example:

#include "rct/Rct.h"

void foo() {
  String S1;
  String S2 = std::move(S1);
  S1 += "asdfg"; // use-after-move of a std::string
}

Here's the report that we are able to obtain for this trivial code snippet, and you can look up the answer to the exercise in the collapsed run-line :)

NoQ added a comment.Nov 15 2018, 5:50 PM

I think we should either remove the non-default functionality (which wouldn't be ideal), or emphasise somewhere (open projects?) that there is still work to be done, but leaving it to be forgotten and essentially making it an extra maintenance work would be, in my optinion, the worst case scenario. Aggressive isn't Pedantic because it actually emits warnings on correct code, and it's not a simple matter of too many reports being emitted, let's also document that this is an experimental feature, not a power-user-only thing.

I only kept the option around because i was under an impression that i'm intruding into a checker that already has some happy users, probably breaking existing workflows. If this option is unnecessary, i'd be happy to remove it :)

NoQ added a comment.Nov 15 2018, 5:51 PM

It would be great to have a way to extend the list of (possibly non-stl) types to check. But I do understand that the analyzer does not have a great way to set such configuration options right now.

Do you envision room for another attribute here? I.e., a class attribute that says "this object is always unsafe to use after move, unless a method annotated with reinitializes is called"?

In D54557#1300581, @NoQ wrote:

Whenever I compile Rtags, I get a bunch of move related warnings, I'm curious how this patch behaves on that project. I'll take a look.

Whoops, sry, i accidentally had a look because i misread your message as if you wanted me to take a look >.<

Spoiler alert!

It finds one positive (and one duplicate of that one positive):

I believe this positive is a real bug, but we can still do better here. We find it as a use-after-move of a local variable, which is not a bug on its own, i.e. the user intended to empty and then re-use the storage and it's fine, this is not a usual kind of typo where the user uses the pre-move variable instead of the post-move variable. But the real bug here is that this local variable uses an std::string as a field under the hood, which is, well, not guaranteed to be empty after move, like all other STL objects: https://stackoverflow.com/questions/27376623/can-you-reuse-a-moved-stdstring

NOTE: As also mentioned in this stackoverflow thread, we also need to exclude smart pointers from the STL check because they don't end up in unspecified state, and see if there are other cornercases here.

OMG That is cool. :D That project was fishy. Nice catch, would you like to make an issue on their project or shall I?

Unfortunately, we don't find this bug as an STL use-after-move bug because inlining isn't happening. Why? I guess i'll leave it as an excercise ^.^ It's a combination of running out of budget and a specific feature that we have. By flipping a single -analyzer-config flag that represents that feature, we are able to find such bugs as STL bugs (when local variable bugs are disabled in the checker). We're still not able to find the original bug, most likely due to running out of budget (i didn't debug this further), but we can find it in a minimal example:

#include "rct/Rct.h"

void foo() {
  String S1;
  String S2 = std::move(S1);
  S1 += "asdfg"; // use-after-move of a std::string
}

Here's the report that we are able to obtain for this trivial code snippet, and you can look up the answer to the exercise in the collapsed run-line :)

Thanks! :) *throws confetti*

In D54557#1300653, @NoQ wrote:

I think we should either remove the non-default functionality (which wouldn't be ideal), or emphasise somewhere (open projects?) that there is still work to be done, but leaving it to be forgotten and essentially making it an extra maintenance work would be, in my optinion, the worst case scenario. Aggressive isn't Pedantic because it actually emits warnings on correct code, and it's not a simple matter of too many reports being emitted, let's also document that this is an experimental feature, not a power-user-only thing.

I only kept the option around because i was under an impression that i'm intruding into a checker that already has some happy users, probably breaking existing workflows. If this option is unnecessary, i'd be happy to remove it :)

Hmm, I'll ask around, but I'm not aware of any ongoing (or planned in the near future) work on this particular checker.

In D54557#1300653, @NoQ wrote:

I think we should either remove the non-default functionality (which wouldn't be ideal), or emphasise somewhere (open projects?) that there is still work to be done, but leaving it to be forgotten and essentially making it an extra maintenance work would be, in my optinion, the worst case scenario. Aggressive isn't Pedantic because it actually emits warnings on correct code, and it's not a simple matter of too many reports being emitted, let's also document that this is an experimental feature, not a power-user-only thing.

I only kept the option around because i was under an impression that i'm intruding into a checker that already has some happy users, probably breaking existing workflows. If this option is unnecessary, i'd be happy to remove it :)

Hmm, I'll ask around, but I'm not aware of any ongoing (or planned in the near future) work on this particular checker.

Yup, there seems to be a desire to keep it around. Let's add an entry to the open projects maybe?

NoQ updated this revision to Diff 174475.Nov 16 2018, 4:08 PM

Give the rvalue reference test function a sensible name.

NoQ added a comment.Nov 16 2018, 4:22 PM

Nice catch, would you like to make an issue on their project or shall I?

I guess it can be an outright pull request :) I'll see if i have time for that soon, please feel free to take initiative><

Yup, there seems to be a desire to keep it around. Let's add an entry to the open projects maybe?

Mmm, what is the project here? What changes are still necessary?

In D54557#1301869, @NoQ wrote:

Nice catch, would you like to make an issue on their project or shall I?

I guess it can be an outright pull request :) I'll see if i have time for that soon, please feel free to take initiative><

I'll do the honors then. :)

Yup, there seems to be a desire to keep it around. Let's add an entry to the open projects maybe?

Mmm, what is the project here? What changes are still necessary?

Wasn't the point of this patch to turn off part of this checkers functionality because it's immature just yet? From what I understand it is desired, but the FP rate is a little too high. I guess fixing that is the project.

Hmm, actually, tinkering with HTML files might be overkill, especially since sphinx will hopefully end that era. Let's just add a TODO and let me deal with it later when I reorganize the checker options. Sorry for talking nonsense :D

In D54557#1300654, @NoQ wrote:

It would be great to have a way to extend the list of (possibly non-stl) types to check. But I do understand that the analyzer does not have a great way to set such configuration options right now.

Do you envision room for another attribute here? I.e., a class attribute that says "this object is always unsafe to use after move, unless a method annotated with reinitializes is called"?

Exactly :) My only concern is that I doubt users will end up passing such options using command line options. Having file base configuration options would be more convenient. They can be easily checked in the repository and evolve together with the product (like the .clang-tidy files in the LLVM repos).

NoQ added a comment.Nov 29 2018, 1:18 PM

Wasn't the point of this patch to turn off part of this checkers functionality because it's immature just yet? From what I understand it is desired, but the FP rate is a little too high. I guess fixing that is the project.

Hmm, actually, tinkering with HTML files might be overkill, especially since sphinx will hopefully end that era. Let's just add a TODO and let me deal with it later when I reorganize the checker options. Sorry for talking nonsense :D

I would probably fine with shipping this other part as a lint check if we start shipping lint checks. I don't have any specific improvements in mind for that part. If somebody is willing to enforce "don't ever use things after move" policy in his code, this will be a good checker for such person. At the same time, i don't see how to improve that part so much that it can be enabled by default. I think projects that consist of "i believe it's impossible but you can try to come up with something" kind aren't very good for the open projects list because the person who will try to work on them may get disappointed when he realizes that his few at-a-glance solutions are bad and in fact nobody ever had any better solutions in mind.

In D54557#1300654, @NoQ wrote:

It would be great to have a way to extend the list of (possibly non-stl) types to check. But I do understand that the analyzer does not have a great way to set such configuration options right now.

Do you envision room for another attribute here? I.e., a class attribute that says "this object is always unsafe to use after move, unless a method annotated with reinitializes is called"?

Exactly :) My only concern is that I doubt users will end up passing such options using command line options. Having file base configuration options would be more convenient. They can be easily checked in the repository and evolve together with the product (like the .clang-tidy files in the LLVM repos).

I mean, attribute, like, annotation in the source code.

Szelethus accepted this revision.Nov 29 2018, 3:43 PM
In D54557#1313382, @NoQ wrote:

Wasn't the point of this patch to turn off part of this checkers functionality because it's immature just yet? From what I understand it is desired, but the FP rate is a little too high. I guess fixing that is the project.

Hmm, actually, tinkering with HTML files might be overkill, especially since sphinx will hopefully end that era. Let's just add a TODO and let me deal with it later when I reorganize the checker options. Sorry for talking nonsense :D

I would probably fine with shipping this other part as a lint check if we start shipping lint checks. I don't have any specific improvements in mind for that part. If somebody is willing to enforce "don't ever use things after move" policy in his code, this will be a good checker for such person. At the same time, i don't see how to improve that part so much that it can be enabled by default. I think projects that consist of "i believe it's impossible but you can try to come up with something" kind aren't very good for the open projects list because the person who will try to work on them may get disappointed when he realizes that his few at-a-glance solutions are bad and in fact nobody ever had any better solutions in mind.

Yea, that makes sense.

This revision was automatically updated to reflect the committed changes.