This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] New Option for ContainerModeling: AggressiveEraseModeling
Needs ReviewPublic

Authored by baloghadamsoftware on Mar 31 2020, 9:19 AM.

Details

Summary

A typical (bad) algorithm for erasing elements of a list is the following:

for(auto i = L.begin(); i != L.end(); ++i) {
   if (condition(*i))
     i = L.erase(i);
}

If condition() returns true for the last element, then erase() returns the past-the-end iterator which the loop tries to increment which leads to undefined behavior. However, the iterator range checker does not find this bug because the "loop iteration before that last iteration" cannot be recognized by the analyzer. Instead we added an option in this patch to ContainerModeling which enables the modeling checker to assume after every erase that the result is the past-the-end iterator (if this case is possible).

Diff Detail

Event Timeline

Do you have a lit test as well? It would be useful to have one that is similar to the one you mention in the review's summary.

clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
788

This seems to be very similar to the hunk we have in handleErase, so this suggests some unnecessary code duplication. I presume we could create a function which is called from both callers.

baloghadamsoftware marked an inline comment as done.Apr 1 2020, 10:25 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
788

Yes, I was thinking about that as well. Many parameters have to be passed so it will be not so much shorter.

Some tests would be appreciated, It really helps me understand whats going on.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
688–692

Hmm. The description isn't really user-friendly, imo, but the option doesn't sound too user-facing either. I don't think this is an option to be tinkered with. I think we should make this hidden.

Also, even for developers, I find this a bitconfusing at first. Do we not enable that by default? Why not? What do we have to gain/lose?

clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
693–700

Right, so the const is a state split. That doesn't sound like something regular users should fine-tune.

693–700

cost* :^)

baloghadamsoftware marked an inline comment as done.Apr 3 2020, 12:04 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
693–700

We made something like this (upon suggestion from @NoQ) in STLAlgorithmModeling. There is an option which is disabled by default, but if the user enables it, then std::find() and the like will aggressively assume that the element may not be found thus they return their second iterator parameter. This option is very similar to that one.

Szelethus added inline comments.Apr 3 2020, 12:55 PM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721

Ah, okay, I see which one you refer to. We should totally make this non-user facing as well.

baloghadamsoftware marked an inline comment as done.Apr 4 2020, 2:04 AM
baloghadamsoftware added inline comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721

The option is not about state split! It is for choosing between the (default) conservative approach and a more aggressive one. It is absolutely for the user to set. Some users prefer less false positive for the price of losing true positives. However, some other users prefer more true positives for the price of additional false positives. This is why we have checker options to be able to serve both groups.

This feature was explicitly requested by our users who first disabled iterator checkers because of the too many false positives but later re-enabled them because they run into a bug in a production system which could have been prevented by enabling them. However, they run into another bug that our checker did not find because of its conservative behavior. They requested a more aggressive one but we must not do it for everyone. The concept of the Analyzer is that we apply the conservative approach by default and the aggressive can be enabled by analyzer and checker options.

NoQ added inline comments.Apr 6 2020, 11:48 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
688–692

What is the rule that the user needs to follow in order to avoid the warning? "Always check the return value of erase() before use"? If so, a good description would be along the lines of "Warn when the return value of erase() is not checked before use; compare that value to end() in order to suppress the warning" (or something more specific).

715–721

These options are unlikely to be on-by-default in vanilla clang because of having a fundamentally unfixable class of false positives associated with them. Honestly, i've never seen users who are actually ok with false positives. I've regularly seen users say that they are ok with false positives when they wanted their false negatives fixed, but it always turned out that they don't understand what they're asking for and they quickly changed their mind when they saw the result. But you guys basically vend your own tool to your own customers and bear your own responsibility and i therefore can't tell you what to do in your realm, and i'm ok with having options that allow you to disagree with my choices.

inb4, "-analyzer-config is not user-facing!!!11"

Please mind that my inline is NOT an objection to this patch.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721

inb4, "-analyzer-config is not user-facing!!!11"

I was starting to breath really heavy and type really fast :D

Jokes aside, I'll spill the beans, we don't have a really convenient way to tweak these options through CodeChecker just yet, so for the time being, they really aren't, but maybe one day.

With that said, I also agree with you regarding options that overapproximate the analysis (greater code coverage with more false and true positives). User facing options should be the ones where the nature of the bug itself if subjective, like whether we want to check the initializedness of pointees. We sure can regard them as unwanted, or we could regard them as totally fine, so an option is a great compromise.

[...] it always turned out that they don't understand what they're asking for [...]

This is why I think this shouldn't be user-facing. You need to be quite knowledgeable about the analyzer, and IteratorChecker in particular to make good judgement about enabling the option added in this patch. That doesn't mean you shouldn't experiment with it, but I'd prefer to not expose it too much.

I just don't see this feature being that desired by many if they knew the costs, even if one user really wants it. We could however, in that case, say
"Hey, there is this option you can enable that might help you hunt down more bugs. It does change the actual analysis itself, and experience showed that it might result in more false positives and could impact performance, so we opted not to make it user-facing, but it is there, if you feel adventurous."

Szelethus added inline comments.Apr 6 2020, 12:49 PM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721

Granted, I might be demonizing the cons of these options too much, but the description doesn't help on understanding it that much either O:)

baloghadamsoftware marked an inline comment as done.Jul 1 2020, 4:06 AM
baloghadamsoftware added inline comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721

So you both are saying that I should tell the user that we cannot help, the community does not support that he catches this bug in early phase with the analyzer. He must catch it in the testing phase for higher costs. This is not the best way to build confidence in the analyzer.

In my perception options are used to serve one users need without harming others. I never understand why there is so much resistance against introducing an option that does not change the default behavior at all.

Rebased. Tests added. Fixed to work on std::vector-like and std::deque-like containers as well.

NoQ added inline comments.Jul 7 2020, 9:34 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
715–721
  1. That's right, implementing every single suggestion users have is a terrible idea. Not everything is possible, not everything is feasible. Populism should not be put ahead of common sense. As an extreme example, imagine you're in the 80's and a user complained that they need a voice recognition feature in their computer ("Why do I have to type all this text, it's so slow???"). Would you be ready to inform them that it'll take 30 years and a few revolutionary breakthroughs in computer science to implement? Would you be ready to tell them that learning how to touch-type will make their typing faster than anything they could ever achieve with dictation? Because that's your job as a programmer to be assertive and to be able to say "no" or "maybe later" or "you're not using the program correctly" when user demands don't make sense or are impossible to implement or if the user doesn't know what they're asking for. Developers of every single program that has any users at all constantly struggle with this.
  1. Supporting multiple configurations is much harder than supporting a single configuration. I wish I could say that it will be entirely up to you to maintain these configurations, including the situations when it doesn't cause problems yet but obstructs future development. However, I can't say that, due to 3.
  1. You can't control who uses your option once it's landed in open-source. By committing the option to the upstream LLVM, you're not shipping this option to just one user, you're shipping it to the entire world.
  1. You can't simply say "I know it may sometimes work incorrectly but I tested it and it works for me". The option may behave fine on codebases in your company but completely ruin reputation of the static analyzer by producing a lot of false positives on a famous open-source project that you didn't test it on. This is why we care a lot about correctness and not introducing new sources of false positives.
  1. You may say that the users know what they're doing when they turn the option on, and that they're willing to accept some false positives. That's not true though; you don't personally brief every user of your option. They may try it simply because they heard about it from someone else. They're not always willing to read documentation. One user may enable the option for all other developers and they may never learn that this was an option to begin with.
  1. In particular, i believe that false positives destroy confidence in the analyzer much more than false negatives. I also don't trust the users when they say they don't care about false positives. I heard this many times from a variety of people but when I gave them what they wanted they immediately started complaining about false positives every single time.
  1. Static analysis is not a one-size-fits-all solution to bug finding. Instead of trying to find a way to find every kind of bug statically, you should be trying to find the best tool for the job. There are areas in which static analysis excels, like finding exotic execution paths not covered by tests, but it's not good for everything.
  1. Static analysis is not necessarily cheaper than testing or code review or dynamic analysis. Understanding a single analyzer report may take hours.
  1. False positives reduce cost-effectiveness of static analysis dramatically. They may cause the user to spend a lot of time trying to understand an incorrect warning, or the users may introduce a new bug when they try to silence the warning without understanding the warning and/or the surrounding code.
  1. Last but not least, honestly, we have significantly more important problems to fix. If you claim that adding an option makes a few users happy without changing anything for anybody else, then you must also admit that you're doing a lot of work for just a small portion of users, when you could spend your time (and our time) benefiting *every* user instead.

So, like, options aren't too bad on their own but they're definitely not as free as you describe. You still need to think about *all* users who can potentially access the option, not just the one who's in front of you. "The analyzer produces false positives under an option" still means that the analyzer produces false positives in one of the supported configurations. Marking features as opt-in doesn't give you an excuse for shipping bad features.

There is a parallel discussion in this patch on how we should cater to user requests, I wrote a lengthy comment about my opinion, but I just don't think it adds much -- at the end of the day, it is fair that if we implement an feature for a small subset of users that is known to cause unpleasant side effects, we shouldn't make it a part of the default user interface, because among many other things it inevitably forces other contributors to maintain it, even if they weren't enthusiastic about the idea, but its also fair to prioritize direct users within reason.

Since this is in the same bullpark as D32905, I think this should land in some way or another.

(@baloghadamsoftware) So you both are saying that I should tell the user that we cannot help, the community does not support that he catches this bug in early phase with the analyzer. He must catch it in the testing phase for higher costs. This is not the best way to build confidence in the analyzer. [...]

Nobody said we cannot help. What I say, is that we should not task the broad analyzer community with the maintenance of an option that is known to make state splits that we might not be untitled to. In fact,

(@NoQ) [...] But you guys basically vend your own tool to your own customers and bear your own responsibility and i therefore can't tell you what to do in your realm, and i'm ok with having options that allow you to disagree with my choices.

I don't see how this option shouldn't be alpha -- it creates paths of execution that might be impossible. If I understand this correctly, we would emit a false positive here:

std::vector<int> V = get(); // constructs this const vector every time: { 1, 2, 3 }
auto i = V.erase(std::find(V.begin(), V.end(), 1)); // state split is creater where i is v.end()
*i;

As of now, the only way to fine-tune the analyzer through CodeChecker (edit saargs.txt, add the flag --saargs saargs.txt or something like that) is a quite a jarring interface, and is also buggy (like this thing). Eventually, it would be great to make the -analyzer-checker-option-help list visible and easily configurable. I wonder however, if I was a newcomer to CodeChecker, what my reaction would be an option that says "Enables exploration of the past-the-end branch for the return value of the erase() method of containers". Like, why wouldn't I enable that? Lets say we add a disclaimer that it could emit false positives. Lets say I stumble across one. Am I supposed to create a bug report, or is this intended behavior? Not the mention that CodeChecker might not be the only tool to have plans to support -analyzer-checker-option-help.

To be honest, I probably shouldn't even see these options straight away. Since this option creates state splits we might not be entitled to, we need more testing and data to get to a point where we could safely put in the default interface -- these are literally the traits of what we call alpha checkers/options. Its just simply not finished, and no amount of documentation will change that. Once I have a better feel for how the analyzer behaves or I'm really stuck, I may wanna go and tick a few alpha stuff as a last resort. This is why we should categorize such option under a different flag.

But, as you said in D77150#inline-707863, there is value in these forsaken options after all. They have on occasion found real bugs that saved a lot of time, and we have gotten a lot of feedback on alpha checkers that also helped us fix a lot of issues. But those are still users that decided to opt-in to in-development features. In retrospect, D32905 should've been alpha for this very reason as well.

dkrupp requested changes to this revision.EditedJul 14 2020, 3:55 AM

Since the analyzer cannot cannot model the size of the containers just yet (or precisely enough), what we are saying with this checker is to "always check the return value of the erase() function before use (for increment/decrement etc.) whether it is not past-end" .

Adam, are you sure that the user would like to enforce such a generic coding rule? Depending on the actual code analyzed, this would require this clumsy check at potentially too many places.

Wouldn't it be better to wait for the container size modeling patch? Then the user would only get a warning when we are sure that we are erasing the last element of the container.

In general I think we should keep the number of config parameters to a minimum as it is very hard to explain to the users which one to configure to what value and why.

Oops, I did not mean "request changes". This is just meant to be an additional comment in this discussion.

This revision now requires changes to proceed.Jul 14 2020, 3:55 AM

I am trying to measure the amount of extra false positive on real code (not artificial examples) of this option. The modeling of container size may reduce it, of course, but we have no other option to find this bug: even loop-unrolling or loop-widening is not be able to recognize the last before the last iteration of a loop. I cannot imagine any solution to recognize it, this really sounds like voice recognition in the 80's as @NoQ wrote. I do not expect such thing in the next 10 years. Refining and enhancing container modeling is expected to reduce false positives generally in the iterator checkers and will eventually result in moving them out of alpha state.

Now I made some measurements about the false positives this option adds.

For BitCoin the increase was -1. I do not know how it happened, but it reduced the number of false positives by one.

For LLVM/Clang the increase was 15 which is less than 4% beacause we have 419 findings which is enormous. Thus the real problem that scares the user is not the 15 false positives but the 419. To reduce them we must refine the modeling of containers which is also expected to reduce the 15. (I did not examine the 15 findigs, maybe some of them are true positives.)

Honestly, this was the result I expected. This extra state split should not add many false positives because it is extremely rare that we know that we are erasing an element which is not the last one but this knowledge is not expressed by constraints. Of course, we can create artificial examples where this option gives false positives, but the measurement shows that in real code bases the increase is minimal. However, the bug is a serious one which may cause undefined behavior that may lie hidden in tests.

Szelethus requested changes to this revision.EditedJul 28 2020, 5:29 AM

Let me double down -- just because we let some select people know about an option, I still don't think it should be user facing. I'm strongly against the philosophy that a core-modifying decision, such as configuring a state split should be presented on the default interface. Having "smart users" isn't an excuse to that either -- the last thing I'd like to see broadcasted is that you need to be a genius* to use or configure a tool that is suppose to serve you, not the other way around.

edit: * and most importantly, an expert in the analyzer.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
688–692

Please mark this hidden.

This revision now requires changes to proceed.Jul 28 2020, 5:29 AM

Since the analyzer cannot cannot model the size of the containers just yet (or precisely enough), what we are saying with this checker is to "always check the return value of the erase() function before use (for increment/decrement etc.) whether it is not past-end" .

Adam, are you sure that the user would like to enforce such a generic coding rule? Depending on the actual code analyzed, this would require this clumsy check at potentially too many places.

While I agree that realising in the analyser that the user code at hand is meant to express that the iterator stepping will not go out of bounds, you have to keep in mind, that going out of bounds with an iterator is still UB. In case of a for loop, it's always a dangerous construct to call erase() inside, unless you, indeed, can explicitly ensure that you won't go out of bounds.

Side note: I am surprised there isn't a Tidy check in bugprone- that simply says "If you are using erase(), use a while loop and not a for loop..."

I'm not terribly up-to-date with @baloghadamsoftware's patches (nor the innards of the analyser), but as a user who'd get this warning, I believe one crucial information missing is something like "note: assuming erase() removed the the last element, setting it to the past-the-end iterator"

Oops, I did not mean "request changes". This is just meant to be an additional comment in this discussion.

You can do the Resign as Reviewer action to remove your X from the patch. 🙂

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
688–692

Is marking something hidden still allow for modifying the value from the command-line, or via CodeChecker?

clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
680–681

Is this creation prevented?

Option made hidden. However, I think we should create a category "advanced" for options which affect the internal modeling. Our users are programmers and the advanced among them should see this option (and others similar to this one).

baloghadamsoftware marked an inline comment as done.Aug 3 2020, 10:31 AM
Szelethus accepted this revision.Aug 4 2020, 7:52 AM
Szelethus added a reviewer: gamesh411.

LGTM on my end, but please wait for approval from either @gamesh411 or @NoQ.

However, I think we should create a category "advanced" for options which affect the internal modeling. Our users are programmers and the advanced among them should see this option (and others similar to this one).

I think that is a discussion for another time, and definitely on cfe-dev.