This is probably a "throw-away" patch which attempts
to model automatic implicit destructor calls.
Details
- Reviewers
NoQ vsavchenko xazax.hun teemperor
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
This is a minimal model of destructors in smart-ptr.
Other than the need to probably model the destructor of the pointee, is there anything else to do?
Yes, I think this should work.
You're invalidating less regions than a normal destructor invalidation would have caused (eg., you're not touching globals). One way to emulate that precisely would be to construct a CallEvent for the destructor and invoke CallEvent::invalidateRegions() on it, which should be relatively easy given that we don't need a pointee destructor expression for this to work (because destructors don't typically have expressions anyway; so this will be much harder in case of make_unique and the constructor as the constructor would demand a construct-expression).
Also it makes sense to omit the invalidation when the pointee type doesn't have a destructor.
But before we go there we should decide whether we want to actually go for inlining (or otherwise default-evaluating) these destructors. If we do, we should probably not spend too much time on improving invalidation in the checker, because default evaluation would do that properly for us anyway (well, it doesn't really dodge any problems such as the absence of the necessary AST so we'll probably have to solve all these problems anyway, just in a different setting). So it's great that we've fixed evalCall for destructors, this could definitely land as a separate patch (tested via debug.AnalysisOrder), but we really need to think what to do next here. So I recommend gathering some data to see if proper destructor evaluation is actually a real problem.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
398 | {} are unnecessary because llvm::ArrayRef is implicitly constructible out of a single value. The null expression situation shouldn't be too harmful given that we've already doing this for conservatively evaluated destructors outside of evalCall(). That said, it's still actively incorrect because given that it's part of the symbol's identity, it causes us to use the same abstract symbol for different actual runtime values. I guess a proper fix would involve updating the identity of a SymbolConjured to include a CFGElementRef instead of a statement. Or, well, building a better SVal kind (or maybe even a non-value "marker") specifically for invalidation purposes, which would capture an explanation for invalidation and the role that the value played in it (was it an unknown return value? an unknown out-parameter value? a default value covering invalidated globals? a checker-specific value?) which we could introspect later (say, for suppression purposes). This doesn't seem to be urgent though. |
But before we go there we should decide whether we want to actually go for inlining (or otherwise default-evaluating) these destructors. If we do, we should probably not spend too much time on improving invalidation in the checker, because default evaluation would do that properly for us anyway (well, it doesn't really dodge any problems such as the absence of the necessary AST so we'll probably have to solve all these problems anyway, just in a different setting). So it's great that we've fixed evalCall for destructors, this could definitely land as a separate patch (tested via debug.AnalysisOrder), but we really need to think what to do next here. So I recommend gathering some data to see if proper destructor evaluation is actually a real problem.
MallocChecker doesn't seem to mind not evaluating destructors properly. With the current version of the patch, the following code doesn't emit any warnings:
class SillyPtr { int *ptr; bool wasMalloced; public: SillyPtr(int *ptr, bool wasMalloced = false) : ptr(ptr), wasMalloced(wasMalloced) {} ~SillyPtr() { if (wasMalloced) free(ptr); else delete ptr; } }; void foo() { int *ptr = new int(13); SillyPtr silly(ptr); // No leak here! }
I am going to remove the debug dumps and run this patch on the projects in the clang/utils/analyzer/projects folder. If I don't find any false positives being caused due to this lack of modelling, then I think we can defer the proper handling of destructors (ie, finish up the invalidation) and move on to the other remaining problems (notes on get for an instance).
the following code doesn't emit any warnings
This code doesn't seem to have any unique_ptrs in it? It's not like you're modeling this custom class as well? Can you try the same with the actual unique_ptr?
The following code emits a warning for leaked memory:
#include <memory> class Lame { int *ptr; public: explicit Lame(int *ptr) : ptr(ptr) {} ~Lame() { delete ptr; } }; void foo() { int *ptr = new int(13); auto smart = std::make_unique<Lame>(ptr); // No leak here }
It seems that there is a flaw in the way I was testing for warnings.
Why does the following command not display the warnings? ./llvm/release/bin/clang -std=c++17 -Xclang -analyze -Xclang -analyzer-checker=core,cplusplus.Move,cplusplus.NewDelete,alpha.cplusplus.SmartPtr -Xclang -analyzer-output=text -Xclang -analyzer-config -Xclang cplusplus.SmartPtrModeling:ModelSmartPtrDereference=true -c lame-class.cpp
Wait, what's the difference between this command and the command that did emit the warning for you?
With that specific invocation, apart from the missing cplusplus.NewDeleteLeaks, and apart from noticing that std::make_unique isn't actually getting modeled but inlined instead (I have all your patches pulled and this patch applied; and the object-under-construction seems to be available later, in both C++11 and C++17, so it's kinda weird and needs more debugging), I think the reason this doesn't warn is that the pointer is technically put on the heap (when stored into Lame::ptr) which makes it accessible for the entire program and therefore potentially delete-able by anybody and we suppress the warning (it's kinda frustrating but we already have the opposite problem for locals and we don't really know how to solve either of those: D71041#inline-641497, the reverted D71152, etc.)
So basically it sounds like this is indeed not going to be too big of a problem, given that memory managed by unique_ptr is always heap memory, so everything that's ever put there during analysis is never going to produce any leak warnings. Maybe we could indeed get away with conservative modeling of constructors (inside make_unique) and destructors.
We do need conservative modeling of constructors in make_unique though, otherwise your pointer never reaches Lame::ptr which means an actual leak warning.
Uh-oh, sorry, wrong clang! std::make_unique is indeed modeled. I think the only problem with your invocation is the missing checker. And it also indeed confirms my suggestion that the constructor inside make_unique should be modeled, at least conservatively like we did with destructor.
No, that's the wrong destructor. We don't want to invalidate the smart pointer; we've already modeled it precisely. What i meant was construct a new CallEvent (through CallEventManager) for the destructor of the pointee and use that.
Ah I see.
As a side note, without the "redundant" invalidation that is being done, the analyzer crashes on shared_ptr. (Because the State essentially remains the same and that's what causes the crash).
Regardless of the kind of pointer, sounds like we need to do something about that API quirk. Eg., it *must* be possible to model a destructor of a std::unique_ptr<int> as a no-op when the tracked raw pointer value is an UnknownVal.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
147 | And why can't we pass STD_PTR_NAMES directly to llvm::is_contained? | |
393–402 | I suggest to add a ton of comments with the reasoning behind these actions. | |
396 | And if it happens we are going to crash with assertion failure? | |
437–439 | Okay, I'm either missing something or this condition is missing ! here. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
437–439 | And that's the ghost bug I am chasing around for the last few hours. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
396 | We should never crash or fail on valid C++ code. We can abandon everything, forbid checker to report anything on a function that has something that we don't know how to handle properly, but never fail the overall analysis process because of that. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
396 | Ah right, I should have put in a TODO. This assert was put to see how often I run into this (none so far). It must be removed before this patch is accepted. |
On running this patch on the projects directory, a bunch of projects emit false-positives: mostly of the form Potential memory leak. This points to the fact that without calling the destructor of the pointee type, we are going to have a lot of false positives (408 for one project is the worst I have seen). I have attached the result file.
Can you attach all or some of the newly found html reports?
Is this about invalidation in ~unique_ptr() modeling being insufficient, or about us not doing anything at all with pointee on other occasions such as .reset() or might it be that the lack of invalidation for constructor inside make_unique() also plays its part?
Well some of them are exactly the same type as the Lame class example above. Like: simbody/report-TestArray.cpp-testMoveConstructionAndAssignment-27-1.html#EndPath. (So the incomplete modelling of the destructor is at least one cause. The other reason that you suggested might as well be true).
Btw, the destructor1.txt file from my previous comment should be used to drill down the newly added reports. (Sorry for the inconvenience, sshfs is really slow and so it was more convenient to tar the whole folder and scp it).
Well some of them are exactly the same type as the Lame class example above. Like: simbody/report-TestArray.cpp-testMoveConstructionAndAssignment-27-1.html#EndPath. (So the incomplete modelling of the destructor is at least one cause. The other reason that you suggested might as well be true).
The problem with the Lame class above was the constructor in make_unique, not the destructor.
But more importantly, in this case the destructor most likely isn't responsible for freeing memory. The push_back() method most likely *moves* the smart pointer into the array, so by the time ~unique_ptr() hits the smart pointer is already empty, there's nothing to free. What we're missing is a "pointer escape" event for the raw pointer. It sounds like a feature we have to implement: when the smart pointer region pointer-escapes (in this case, due to being passed into an unknown function via rvalue reference), the raw pointer region should also pointer-escape. (Or, if push_back() is inlined then there's a different reason for escape, something along the lines of getting move-assigned into the heap, which we should probably also implement separately!). Damn, that's an interesting can of worms.
I have incorporated the bug-fixes suggested last meeting (except the pointer escape one). And it seems to have had dramatic results - now the only extra errors being reported are the pointer escape ones (5 of them, from 3 different projects). Some projects are actually reporting that bug reports have been removed due to this patch.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
288 | Something's not right. Returning true here would discard the state and terminate evalCall as failed. Why compute the invalidated state if we throw it away? |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
444 | It seems to me that this pointer escape doesn't work. void foo() { auto ptr = std::unique_ptr<int>(new int(13)); // Leak warning emitted here } the exploded graph shows the SVal for new int(13) as allocated instead of escaped (which eventually triggers the warning). |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
444 | It shouldn't work in this case. The variable is local. Write to a local variable doesn't constitute an escape because access to a local variable from elsewhere is impossible. I believe we should explicitly tell MallocChecker that memory is released, given that we know that this is exactly what happens. We could do this similarly to how InnerPointerChecker tells MallocChecker that std::string::c_str() is released when the string is destroyed. Another solution would be to force an escape by calling escapeValue() directly. That'll definitely notify all checkers that the raw pointer value should be dropped but that wouldn't allow us to ultimately find use-after-free of that value. |
The code looks great, I don't see any major problems.
We still need tests, I can't stress this enough. All the real-world cornercases you've covered here as you updated the patch deserve a test case.
Some of these changes should probably be separated into other patches, eg. invalidation and pointer escape for non-destructor operations.
And why can't we pass STD_PTR_NAMES directly to llvm::is_contained?