This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Enabling MallocChecker to take up after SmartPtrModelling
Needs ReviewPublic

Authored by RedDocMD on Mar 16 2021, 10:55 AM.

Details

Summary

This I think should not be a TODO, since MallocChecker indeed does track
released pointers. I have added a test for that.

Diff Detail

Event Timeline

RedDocMD created this revision.Mar 16 2021, 10:55 AM
RedDocMD requested review of this revision.Mar 16 2021, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 16 2021, 10:55 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
steakhal requested changes to this revision.Mar 16 2021, 11:37 AM

I don't think the TODO is addressed.
By checking the git blame quickly, there was no change committed to the SmartPtrChecker affecting the collaboration with the MallocChecker after the TODO was introduced in the source code.
Thus, I think the TODO is probably not yet addressed.

For example this code does not trigger any warning:
https://godbolt.org/z/aM4xe7

void other(A *oldptr) {
  std::unique_ptr<A> P(oldptr);
  A* aptr = P.release();
  delete aptr;
  P->foo(); // No warning here, and probably this is case that the TODO want to describe.
}

Nevertheless, this limitation deserves a test case with a FIXME, if it's not there already.

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
16

Why do you include this?

This revision now requires changes to proceed.Mar 16 2021, 11:37 AM
NoQ added a comment.Mar 16 2021, 1:02 PM

For example this code does not trigger any warning:
https://godbolt.org/z/aM4xe7

FTFY: https://godbolt.org/z/vT5c7s

This test isn't very interesting though because the warning doesn't have anything to do with the smart pointer. Any use of the pointer after delete deserves a warning regardless of whether it's coming from the smart pointer or not. In other words, in this test case MallocChecker doesn't start tracking the pointer at release() but it only starts tracking the pointer later, at delete.

A more interesting test would be to diagnose a leak if the pointer isn't deleted. Which we don't do: https://godbolt.org/z/1EczEW So, yeah, i think this TODO is still relevant.

RedDocMD updated this revision to Diff 331164.Mar 16 2021, 10:43 PM

Removed unnecessary include

@NoQ, regarding https://godbolt.org/z/1EczEW, I don't think there should be a warning since the original pointer may be needed in the caller.
But thanks for pointing out the possible leaks due to release! :)
In the line that you have pointed out, I have thought about the following cases which should have a warning (but currently don't):

RedDocMD retitled this revision from [analyzer] Remove unnecessary TODO to [analyzer] Enabling MallocChecker to take up after SmartPtrModelling.Apr 3 2021, 1:51 AM
NoQ added a comment.Apr 5 2021, 10:49 AM

@NoQ, regarding https://godbolt.org/z/1EczEW, I don't think there should be a warning since the original pointer may be needed in the caller.

You're absolutely right! I missed this point!

I still think it's worth a few experiments to see if the very presence of unique_ptr in the code is a good-enough indication that the callee expects unique ownership. But no, that's definitely not the point of that TODO.

Here's a related example where the caller technically can still rely on keeping the pointer but that situation would be even more absurd: https://godbolt.org/z/Mnos89ehK - basically the only way for the caller to access the raw pointer value before it gets destroyed is from within a temporary destructor that precedes the destructor of the unique_ptr argument. Which is valid but i'm leaning heavily towards still emitting a warning.

Another example: https://godbolt.org/z/on13Kv1q4 - this is definitely questionable but probably still deserves experimentation.

Ok so long story short, the only information that we gain about the raw pointer on .release() is that (assuming the default deleter is used) it's some memory allocated by new. We already knew (no pun intended) that about the raw pointer but .release() sounds like a good place to re-attach it. We do not really gain knowledge that we have to release the value but we can try to see what happens if we pretend we do. I guess we could rephrase the TODO to reflect that - "experiment with detecting leaks of the raw pointer after .release()" or something like that. We also definitely don't gain the knowledge that the value is already released (it definitely isn't, despite the method's name).

But thanks for pointing out the possible leaks due to release! :)
In the line that you have pointed out, I have thought about the following cases which should have a warning (but currently don't):

In these cases MallocChecker should already be tracking the pointer even before it was put into unique_ptr. So i guess we should investigate where/why this information gets lost. But we shouldn't be re-attaching it; it should be already there.

I think something is wrong about the way we are investigating this or I don't understand the MallocChecker.
The following doesn't yield a bug report! => https://godbolt.org/z/Y57G7zE5j

Okay, this is alarming.
The following code yields a leak warning:

#include <iostream>
#include <memory>

struct A {
  int field;
  A(int f = 0) : field{f} {}
  void foo() { std::cout << "Field: " << field << "\n"; }
};

void foo() {
  int *raw = new int(10);
  std::cout << *raw << "\n";
}

But the following doesn't:

#include <iostream>
#include <memory>

struct A {
  int field;
  A(int f = 0) : field{f} {}
  void foo() { std::cout << "Field: " << field << "\n"; }
};

void foo() {
  std::unique_ptr<A> P{new A(10)}; // <-- This is the additional line
  int *raw = new int(10);
  std::cout << *raw << "\n";
}

So the mere presence of a unique_ptr seems to turn off/cripple the MallocChecker!

As you can see here, the bug note is attached in the second case (please refer to my previous comment), but somehow, it doesn't finally show up as a bug.
Weird.

NoQ added a comment.Apr 26 2021, 12:01 AM

I think something is wrong about the way we are investigating this or I don't understand the MallocChecker.
The following doesn't yield a bug report! => https://godbolt.org/z/Y57G7zE5j

Uh-oh, we forgot to enable cplusplus.NewDeleteLeaks which is a separate checker. https://godbolt.org/z/cfzdTM8vf

As you can see here, the bug note is attached in the second case (please refer to my previous comment), but somehow, it doesn't finally show up as a bug.
Weird.

This means that one of the bug visitors has invoked PathSensitiveBugReport::markInvalid() to suppress the report. You can break at this function to find out more.

The call to PathSensitiveBugReport::markInvalid() is triggered by LikelyFalsePositiveSuppressionBRVisitor::finalizeVisitor(). So I guess we need to see now why the visitor thinks this is a false positive.

Judging by this line in the LikelyFalsePositiveSuppressionBRVisitor::finalizeVisitor() method, it seems that the bug report is squelched when the visitor encounters an ExplodedNode which corresponds to a LocationContext, whose associated Decl lies in std namespace. I guess, by default, the option to suppress warnings from the std library is enabled. Which makes sense, except in this case since unique_ptr is in std and it is being used in that function, the bug report is suppressed.

NoQ added a comment.EditedApr 27 2021, 4:29 PM

when the visitor encounters an ExplodedNode

Weird. finalizeVisitor() accepts not any node but the error node. Your screenshot suggests that the error node is not in the standard library but in user code. Might it be that there are multiple error nodes and you're looking at the wrong one? As usual, you can set conditional breakpoints by node IDs.

In D98726#2721228, @NoQ wrote:

when the visitor encounters an ExplodedNode

Weird. finalizeVisitor() accepts not any node but the error node. Your screenshot suggests that the error node is not in the standard library but in user code. Might it be that there are multiple error nodes and you're looking at the wrong one? As usual, you can set conditional breakpoints by node IDs.

I had set a breakpoint on the function finalizeVisitor(), unconditionally. And it stops exactly once, on the node I had sent the screenshot of. That said, I am going to repeat the experiment, removing all use of std::cout (and iostream), and report back.

In D98726#2721228, @NoQ wrote:

when the visitor encounters an ExplodedNode

Weird. finalizeVisitor() accepts not any node but the error node. Your screenshot suggests that the error node is not in the standard library but in user code. Might it be that there are multiple error nodes and you're looking at the wrong one? As usual, you can set conditional breakpoints by node IDs.

Okay no you are right. I was looking at the wrong error node. There are two. Lets take the following code:

#include <memory>

int foo() {
  std::unique_ptr<int> P(new int(10));
  int *raw = new int(13);
  int b = *raw;
  return b;
}

There is clearly a leak but that is getting squelched. Now for the two error nodes:

  • This is the error node we expect, it points to our source code.

  • This is another error, somewhere in the unique_ptr.h file.

The second one is the one that causes the first one to get squelched. I suppose how this point was reached was due to the StaticAnalyzer trying to reason about the destructor of unique_ptr, which is implicitly called in at the end of the function.
For reference, here is the dot file for the exploded graph dump.

Ok so if I was to debug this further, I'd set a breakpoint on markInvalid(), figure out which invocation corresponds to the right bug report, and that'd lead me to an explanation why this bug report was suppressed.

This could be an interesting investigation because it may uncover a bug in the suppression mechanism which causes us to have false negatives all over the place. False negatives are hard to notice because they only manifest as an absence of warning which isn't very in-your-face.

On the other hand, this specific issue will most likely be eliminated as soon as you replace unique_ptr methods' inlining with direct modeling.


Coming back to the patch in general, even though I had some ideas in D98726#2669315, I have a feeling that I'm forgetting something. @xazax.hun @vrnithinkumar, IIRC last year we added this FIXME not only because we wanted to find more bugs but also because we actually had some correctness problems here, do you remember what they were?

Judging by this line in the LikelyFalsePositiveSuppressionBRVisitor::finalizeVisitor() method, it seems that the bug report is squelched when the visitor encounters an ExplodedNode which corresponds to a LocationContext, whose associated Decl lies in std namespace. I guess, by default, the option to suppress warnings from the std library is enabled. Which makes sense, except in this case since unique_ptr is in std and it is being used in that function, the bug report is suppressed.

This is what causes the false suppression. To be more specific, the analyzer tries to follow the logic of the destructor of unique_ptr into the standard library. And since that is in the std namespace, it causes LikelyFalsePositiveSuppressionBRVisitor::finalizeVisitor() to squelch the report. Now there are two problems here:

  • Why does the analyzer try to follow the logic into the standard library? Is it because we haven't explicitly modeled it? (Then, as you said, modelling this method will solve the issue).
  • Why is there a bug in unique_ptr.h? (This is the worse of the two, IMO). I am going to take a look at the standard library code (sigh) and see if that's an actual bug or another false positive.
NoQ added a comment.Jun 6 2021, 5:48 PM

This is what causes the false suppression. To be more specific, the analyzer tries to follow the logic of the destructor of unique_ptr into the standard library. And since that is in the std namespace, it causes LikelyFalsePositiveSuppressionBRVisitor::finalizeVisitor() to squelch the report. Now there are two problems here:

Which of the two reports are you describing here?

  • Why is there a bug in unique_ptr.h? (This is the worse of the two, IMO). I am going to take a look at the standard library code (sigh) and see if that's an actual bug or another false positive.

If the warning is / the bug path ends in unique_ptr.h it doesn't mean that the bug is in unique_ptr.h in particular or in the standard library in general. That's the thing with path sensitive analysis: all nodes of the path are potentially responsible for the bug. Take away any statement in the middle and the entire thing might never happen. This is why path-sensitive checkers will most likely never provide fix-it hints: we aren't supposed to know at all which part of the path was the problem, all we know is that the path in its entirety leads to an instance of problematic behavior.

For this reason we're also not massively suppressing all warnings that are displayed against the C standard library headers (unlike clang-tidy). In case of clang-tidy even if there's a bug in the standard library the user will probably be unable to fix it, so it's entirely their job to suppress such warnings. In case of the static analyzer the warning in the standard library isn't an indication that the standard library needs to be fixed at all; most of the time the user can handle it in their code. What we do do, however, is avoid analyzing standard library functions as analysis entry points - as it's a good indication that the entire bug path (and, therefore, the entire list of things to blame) is going to be in the standard library.

We are suppressing warnings against C++ standard library headers though, as you've correctly pointed out. This was a judgement call because it looks like there are too many false positives among these warnings. We should really reconsider this suppression in the future, it could most likely be made much more fine-grained.

We probably did not update this PR with the discussions we had offline. Basically we had a bug with a sink node due to calling into a destructor of unique_ptr. The catch is that the ctor was evalcalled, so the store did not have the correct bindings. Thus, after inlining the dtor the analyzer though we are reading garbage values from memory and a sink node was generated. SuppressOnSink machinery suppressed the leak warning.

The solution here is probably to EvalCall on the dtor as well so the analyzer does not end up thinking the code is reading garbage values from memory.

What's the status on this?

The issues highlighted here have been partially (one could argue mostly) solved by D105821. That patch hasn't been merged because it needs more tests and also needs to be split.