This is an archive of the discontinued LLVM Phabricator instance.

RedDocMD retitled this revision from [analyzer] Handle std::make_unique for SmartPtrModeling to [analyzer][WIP] Handle std::make_unique for SmartPtrModeling.Jun 5 2021, 11:20 AM

Accounting for std::make_unique_for_overwrite

RedDocMD edited the summary of this revision. (Show Details)Jun 5 2021, 11:25 AM

Harbormaster completed remote builds in B107820: Diff 350067.Jun 5 2021, 12:21 PM

Fixed binding of SVal to variable

The drawback of the current approach is that we are not using the following piece of information:

std::unique_ptr created from std::make_unique is not null (to begin with)

I am not being able to use this info since I don't have access to the raw pointer, so cannot create a SVal and then constrain the SVal to non-null.
Any suggestions @NoQ, @vsavchenko , @xazax.hun, @teemperor?

Harbormaster completed remote builds in B107852: Diff 350103.Jun 6 2021, 8:30 AM

I am not being able to use this info since I don't have access to the raw pointer, so cannot create a SVal and then constrain the SVal to non-null.
Any suggestions @NoQ, @vsavchenko , @xazax.hun, @teemperor?

You can always create a new symbol to represent the inner pointer. Something like this already happens, when you have a unique_ptr formal parameter and call get on it.

In D103750#2801248, @xazax.hun wrote:

You can always create a new symbol to represent the inner pointer. Something like this already happens, when you have a unique_ptr formal parameter and call get on it.

The way handleGet obtains a new symbol cannot really be used here since we do not have an Expr for the inner pointer at hand. handleGet does the following:

C.getSValBuilder().conjureSymbolVal(
        CallExpr, C.getLocationContext(), Call.getResultType(), C.blockCount());

Since we have CallExpr, we can easily conjure up an SVal. But I don't see how I can do it similarly in this patch.

Reformatted code

Herald added a reviewer: bollu. · View Herald TranscriptJun 6 2021, 9:10 AM

Fixed git history

Harbormaster completed remote builds in B107856: Diff 350110.Jun 6 2021, 9:52 AM

Since we have CallExpr, we can easily conjure up an SVal. But I don't see how I can do it similarly in this patch.

You should have a CallExpr for std::make_unique too. I believe that expression is used to determine how the conjured symbol was created (to give it an identity). So probably it should be ok to use the make_unique call to create this symbol (but be careful to create the symbol with the right type).

Tests pls!

Yes I think you should totally do an evalCall() here. The function has no other side effects apart from making a pointer so it's valuable to fully model it so that to avoid unnecessary store invalidations.

In D103750#2801320, @xazax.hun wrote:

Since we have CallExpr, we can easily conjure up an SVal. But I don't see how I can do it similarly in this patch.

You should have a CallExpr for std::make_unique too. I believe that expression is used to determine how the conjured symbol was created (to give it an identity). So probably it should be ok to use the make_unique call to create this symbol (but be careful to create the symbol with the right type).

That's right, a conjured symbol isn't necessarily the value of its respective expression. So you can conjure it and assume that it's non-null. Also consider using a heap conjured symbol (a-la getConjuredHeapSymbolVal()). That said, getConjuredHeapSymbolVal() doesn't provide an overload for overriding the type; there's no reason it shouldn't though, please feel free to extend it.

There's a separate story about SymbolConjured being the right class for the problem. I'm still in favor of having SymbolMetadata instead, so that to properly deduplicate symbols across different branches and merge more paths.

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
198	Please use `CallDescription` for most of this stuff.
347	Please merge with `isStdSmartPtrCall`.
348–351	`getTypePtr()` is almost never necessary because `QualType` has an overloaded `operator->()`.

Made stylistic refactors

RedDocMD marked 3 inline comments as done.Jun 7 2021, 11:04 AM

RedDocMD added inline comments.

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
347	I don't think that can be done because `isStdSmartPtrCall` checks calls like `a.foo()`, checking if `a` is a smart-ptr. On the other hand I need to check for statements like `a = foo()` and ensure that `a` is a smart-ptr.

In D103750#2801555, @NoQ wrote:

Yes I think you should totally do an evalCall() here. The function has no other side effects apart from making a pointer so it's valuable to fully model it so that to avoid unnecessary store invalidations.

Am I not doing that now? I didn't get this point.

Ugh, this entire checkBind hack was so unexpected that I didn't even recognize evalCall. What prevents you from doing everything in evalCall? No state traits, no nothing, just directly take the this-region and attach the value to it?

In D103750#2803499, @NoQ wrote:

Ugh, this entire checkBind hack was so unexpected that I didn't even recognize evalCall. What prevents you from doing everything in evalCall? No state traits, no nothing, just directly take the this-region and attach the value to it?

Because in evalCall, I don't have access to the ThisRegion. At least I think I don't.
So we have a statement like auto a = std::make_unique<int>(100). In evalCall I get a CallEvent corresponding to std::make_unique<int>(100). Using Call.getOriginExpr() I can get the LHS. But that's only the Expr, not the MemRegion. Unlike in a regular constructor call or method call, where ThisRegion can easily be obtained from the Call.

Harbormaster completed remote builds in B108033: Diff 350362.Jun 7 2021, 12:03 PM

In D103750#2803516, @RedDocMD wrote:

In D103750#2803499, @NoQ wrote:

Ugh, this entire checkBind hack was so unexpected that I didn't even recognize evalCall. What prevents you from doing everything in evalCall? No state traits, no nothing, just directly take the this-region and attach the value to it?

Because in evalCall, I don't have access to the ThisRegion. At least I think I don't.

Did you consider trying to get it from CXXInstanceCall::getCXXThisVal? Usually, if you cannot access something you need, that is a bug in the analyzer. But I found that this is very rarely the case. Most of the callbacks are very well thought out and already gives access to most of the information you might want to access.

In D103750#2803516, @RedDocMD wrote:

In D103750#2803499, @NoQ wrote:

Ugh, this entire checkBind hack was so unexpected that I didn't even recognize evalCall. What prevents you from doing everything in evalCall? No state traits, no nothing, just directly take the this-region and attach the value to it?

Because in evalCall, I don't have access to the ThisRegion.

You do have access to it. That's how C++ works: the call constructs the value directly into the target this-region. In code

auto a = std::make_unique<int>(100)

it is known even before std::make_unique is invoked (i.e., even in checkPreCall) that the this-region for the call is going to be a. Because it's up to the call to invoke the constructor of the external object, and you can't invoke a constructor if you don't have a this to pass into it. You cannot ever have an object and not have this.

Even if it wasn't for RVO, you would have known the temporary region in which the smart pointer would originally reside.

This was the whole point of the CFG work described in https://www.youtube.com/watch?v=4n3l-ZcDJNY

You're looking in the wrong place though, as std::make_unique returns the structure by value (aka prvalue), so the value of the expression, even if available before the call, was never going to be this (which would have been the corresponding glvalue). What you're looking for is CallEvent::getReturnValueUnderConstruction().

Whoops, I totally missed your point with my suggestion (should not read it while sitting at meetings). But Artem has a great answer already.

Also congrats, sounds like you are to become the first actual user of this API! Hope it actually works 🤞🤞🤞

Nice. Looks like I have something to read up on. Turns out, I end up learning something new in C++ every now and then. 😃

In D103750#2804438, @RedDocMD wrote:

Nice. Looks like I have something to read up on. Turns out, I end up learning something new in C++ every now and then. 😃

Yeah, it's also pretty interesting ABI-wise. As you know, C++ doesn't have stable ABI. That said, one popular calling convention for functions that return C++ objects by value consists in passing a hidden argument to the function that contains the address at which to construct the return value. Like "this" is an implicit argument to methods, the return address is yet another implicit argument. So you can think of getReturnValueUnderConstruction() as of an accessor for that implicit argument value similarly to how getCXXThisVal() is the accessor for the implicit this-argument value.

This is completely different from C where you can return the structure on stack or even in registers (which can't be pointed to at all); this becomes possible in C because structures can be copied arbitrarily without invoking any sort of copy constructor.

Random general advice! When writing patches, you typically explore the design space of potential solutions - which you did great in this case. When you do so, it's a great idea to include the results of such exploration in the commit message (and/or in later comments). Especially if the first / most obvious thing to try didn't work. If you publish your second-choice solution, people will most likely ask you why you chose it instead of the obvious first-choice solution so it's an act of friendliness to anticipate this question. Similarly, if you have concerns about your solution, please communicate them openly; they'll either be discovered anyway but a lot more effort will be spent on it, or they won't be discovered during review which could result in future problems; both variants are worse than communicating openly. Ideally you carry more responsibility for proving that your approach is right than the reviewer has a responsibility to find problems in it. The reviewer is the last line of defense when it comes to applying critical thinking but the process really starts with you as you challenge your own assumptions and approaches by questioning both their correctness and their architectural harmony. Trying to find at least two solutions to each problem and compare them against each other is such a good exercise in critical thinking that almost every patch deserves such discussion.

Put changes discussed in the meeting, tests to come in next revision

Added tests

How do I set the C++ standard while running a test?

In D103750#2812613, @RedDocMD wrote:

How do I set the C++ standard while running a test?

If I understand your comment correctly, then see my inline comment.

clang/test/Analysis/smart-ptr-text-output.cpp
9	You set the standard here via the `-std` arg. If you want different standards you need to have multiple `RUN:` lines that all pass a different standard.

Harbormaster completed remote builds in B108755: Diff 351356.Jun 11 2021, 2:07 AM

Fixed up tests, now also runnning on C++20

RedDocMD marked an inline comment as done.Jun 11 2021, 2:50 AM

RedDocMD added inline comments.

clang/test/Analysis/smart-ptr-text-output.cpp
9	Yup, thanks!

RedDocMD marked an inline comment as done.Jun 11 2021, 2:50 AM

RedDocMD retitled this revision from [analyzer][WIP] Handle std::make_unique for SmartPtrModeling to [analyzer] Handle std::make_unique for SmartPtrModeling.

Harbormaster completed remote builds in B108769: Diff 351376.Jun 11 2021, 4:03 AM

teemperor added inline comments.Jun 11 2021, 5:00 AM

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
183	`getParent()` will assert when the MethodDecl is not a in a Record (which is the only way this can return null) so you can remove that check.

RedDocMD marked an inline comment as done.Jun 11 2021, 9:49 AM

Removed un-necessary check

Harbormaster completed remote builds in B108837: Diff 351485.Jun 11 2021, 10:36 AM

NoQ added inline comments.Jun 11 2021, 11:26 AM

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
209–210	Untyped region isn't a region without a type; everything has a type. Untyped region is when we don't know the type. A typical situation that produces untyped region is when the region comes in through a void pointer. I vaguely remember that one way to trick your specific code may be to do std::unique_ptr<int> foo() { return make_unique<int>(123); } which will RVO into an unknown region. I also wouldn't rely on it being typed in all other cases. A much safer way to access the inner pointer type would be to query the function's template parameter.
222–225	I suggest a `TODO: ExprEngine should do this for us.`.
227	Do we need a note here as well? I guess we don't because we'll never emit null dereference reports against a non-null pointer. But if we later emit more sophisticated bug reports, we might need one. Maybe leave a comment?

RedDocMD marked 3 inline comments as done.Jun 11 2021, 9:15 PM

Fixed up technique used to find inner pointer type, put TODO's

Harbormaster completed remote builds in B108946: Diff 351628.Jun 11 2021, 10:35 PM

NoQ added inline comments.Jun 14 2021, 8:44 PM

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
207	Can you do a `getConjuredHeapSymbolVal()` instead? That'd give us the right memory space as well as the extra bit of information that the new symbol doesn't alias with any previous symbols. I get it that it doesn't accept the type but it's perfectly ok for you to teach it how to accept the type.
clang/test/Analysis/smart-ptr-text-output.cpp
339	Mmm wait a sec, that doesn't look like what the spec says. https://en.cppreference.com/w/cpp/memory/unique_ptr/make_unique: Same as (1), except that the object is default-initialized. This overload participates in overload resolution only if T is not an array type. The function is equivalent to `unique_ptr<T>(new T)` It zero-initializes the pointee, not the pointer. The difference between `std::make_unique<A>()` and `std::make_unique_for_overwrite<A>()` is the difference between value-initialization (invoking the default constructor) and zero-initialization (simply filling the buffer with `0`s).

RedDocMD marked an inline comment as done.Jun 15 2021, 6:49 AM

RedDocMD added inline comments.

clang/test/Analysis/smart-ptr-text-output.cpp
339	Oops! Look like I completely misunderstood the spec. So basically, the inner pointer is not null, and we have the same assumptions that apply for `std::make_unique`. Only problem is that, for primitive types, the value obtained on de-referencing is undefined.

Fixed up meaning of make_unique_for_overwrite, use getConjuredHeapSymbolVal.

vsavchenko added inline comments.Jun 15 2021, 7:20 AM

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
216–220	I think it's fine to have it here in this commit and change it in one of the future commits. But I think we should add a bit more description of 1) what's the problem, 2) how is this a solution, 3) how should it look when moved to the engine.

Harbormaster completed remote builds in B109286: Diff 352127.Jun 15 2021, 8:32 AM

This looks great to me, I think the patch is good to go as long as Valeriy's comment is addressed :)

Speaking of specs, hold up, we're forgetting something very important.

make_unique() also has to call the constructor of the underlying value. We're supposed to model that.

The constructor may be inlined or evaluated conservatively.

~unique_ptr() is also supposed to call the destructor of the underlying value. We're supposed to model that.

I.e., you could imagine a bunch of tests like this

struct S {
  int *p;
  S(int *p): p(p) {}
  ~S() { *p = 0; }
};

void foo() {
  int x = 1;
  {
    auto P = std::make_unique<S>(&x);
    clang_analyzer_eval(*P->p == 1); // TRUE
  }
  clang_analyzer_eval(x == 0); // TRUE
}

We could stick to conservative evaluation of such constructors and destructors; at the very least, we should actively invalidate Store at the destructor. This patch can remain unchanged because the default values inside freshly conjured regions are unknown. But that'd still be a regression because the above test (presumably) passes with inlining.

Is now a good time to legalize modeling function calls inside checker callbacks?

clang/test/Analysis/smart-ptr-text-output.cpp
4	Could you double-check that this flag correctly disables all the newly introduced modeling so that it wasn't accidentally enabled for all users before it's ready?

This revision is now accepted and ready to land.Jun 16 2021, 12:26 AM

Added more info to the TODO

RedDocMD marked an inline comment as done.Jun 16 2021, 9:57 PM

RedDocMD added inline comments.

clang/test/Analysis/smart-ptr-text-output.cpp
4	Yup. There are 133 notes and warnings (just search for "// expected-" and see the count). And on setting the flag to `false`, there are 133 errors. So it works.

RedDocMD marked an inline comment as done.Jun 16 2021, 9:58 PM

I would suppose that constructor calls are properly handled. (ie, member constructors are called properly).
As for modelling destructors, there is a separate problem - since we don't have a Stmt, the postCall handler keeps on crashing.

Harbormaster completed remote builds in B109647: Diff 352621.Jun 17 2021, 5:41 AM

NoQ added inline comments.Jun 17 2021, 11:16 AM

clang/test/Analysis/smart-ptr-text-output.cpp
4	I mean, specifically the modeling added by this patch? I see you wrote an entire checker callback that doesn't seem to have an analyzer-config option check anywhere in it. Simply having different results isn't sufficient to verify this.

In D103750#2823741, @RedDocMD wrote:

I would suppose that constructor calls are properly handled. (ie, member constructors are called properly).

Do the above tests pass when your new evalCall modeling is enabled?

I believe there are a couple of comments that are done but not marked accordingly.
I agree with Artem, if we could craft code that fails due to not calling ctors, we should probably include them in the tests, even if they don't reflect the desired behavior they are a great source of documentation what is missing.

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp
151	Nit: consider `TemplateArgs.empty()`.

RedDocMD marked an inline comment as done.Jun 18 2021, 10:34 AM

RedDocMD added inline comments.

clang/test/Analysis/smart-ptr-text-output.cpp
4	Oh I see.

RedDocMD marked an inline comment as done.Jun 18 2021, 10:35 AM

In D103750#2825342, @NoQ wrote:

In D103750#2823741, @RedDocMD wrote:

I would suppose that constructor calls are properly handled. (ie, member constructors are called properly).

Do the above tests pass when your new evalCall modeling is enabled?

The analyzer doesn't seem to be able to make up its mind.

member-constructor.cpp:15:5: warning: FALSE [debug.ExprInspection]
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:25: note: Assuming the condition is false
    clang_analyzer_eval(*P->p == 0);
                        ^~~~~~~~~~
member-constructor.cpp:15:5: note: FALSE
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:5: warning: TRUE [debug.ExprInspection]
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:25: note: Assuming the condition is true
    clang_analyzer_eval(*P->p == 0);
                        ^~~~~~~~~~
member-constructor.cpp:15:5: note: TRUE
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

RedDocMD marked an inline comment as done.Jun 18 2021, 10:49 AM

Little changes, a failing test

In D103750#2825823, @xazax.hun wrote:

I believe there are a couple of comments that are done but not marked accordingly.
I agree with Artem, if we could craft code that fails due to not calling ctors, we should probably include them in the tests, even if they don't reflect the desired behavior they are a great source of documentation what is missing.

Do you want the new failing test to be marked expected to fail?

In D103750#2827566, @RedDocMD wrote:

Do you want the new failing test to be marked expected to fail?

I usually just add the wrong expectation to make the test pass and add a TODO comment that explains why is this wrong and we should fix it in the future.

In D103750#2827566, @RedDocMD wrote:

Do you want the new failing test to be marked expected to fail?

The line of thinking here is that tests are just something that gives us a signal when the behavior changes. They don't necessarily demonstrate the expected behavior. If they don't there's still value in knowing that the observed behavior on them has changed.

For example, if you believe that your commit shouldn't change the observed behavior ("NFC commit", eg. refactoring), any signal that the observed behavior has changed should raise concerns and probably cause you to rethink the commit, even if the behavior has in fact improved.

Similarly, if the functional change is intended, you may still find it suspicious if it suddenly fixes a test that it wasn't supposed to fix. This may help you discover a bug in your commit that would cause the behavior to regress in other related cases.

You don't want the buggy test to be silenced, you want it to keep actively verifying that the bug is not yet fixed.

Then, separately, for every test there's a requirement to keep it understandable, so that the reader would understand what exactly is tested and what are the consequences of breaking the test. It's often valuable to provide comments indicating your level of confidence - "this test is extremely important to pass", "we don't really care about our behavior on this test but it's still nice to know when that behavior changes", etc. - and "we definitely don't want this test to pass but for now it does" is a possible answer as well and it doesn't make the test any less valuable.

So basically

void foo() {
  // FIXME: Should be FALSE.
  clang_analyzer_eval(1 + 1 == 3); // expected-warning{{TRUE}}
  // FIXME: Should be TRUE.
  clang_analyzer_eval(1 + 1 == 2); // expected-warning{{FALSE}}
}

is a great test to have.

Harbormaster completed remote builds in B109963: Diff 353048.Jun 19 2021, 5:23 AM

In D103750#2827548, @RedDocMD wrote:

In D103750#2825342, @NoQ wrote:

Do the above tests pass when your new evalCall modeling is enabled?

The analyzer doesn't seem to be able to make up its mind.

member-constructor.cpp:15:5: warning: FALSE [debug.ExprInspection]
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:25: note: Assuming the condition is false
    clang_analyzer_eval(*P->p == 0);
                        ^~~~~~~~~~
member-constructor.cpp:15:5: note: FALSE
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:5: warning: TRUE [debug.ExprInspection]
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
member-constructor.cpp:15:25: note: Assuming the condition is true
    clang_analyzer_eval(*P->p == 0);
                        ^~~~~~~~~~
member-constructor.cpp:15:5: note: TRUE
    clang_analyzer_eval(*P->p == 0);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

What about this?

In D103750#2828995, @RedDocMD wrote:

What about this?

I believe that you might see analyzer-eagerly-assume in action. Basically, instead of keeping a constraint unknown, the analyzer can split the path eagerly. So you have one path where the constraint is assumed to be true and one where it is assumed to be false. I think the rationale was that the user is likely to branch on certain conditions and doing it eagerly can make the analysis more precise in some cases but I do not really remember the details. Hopefully Artem has some more contexts.

clang/test/Analysis/smart-ptr-text-output.cpp
353	Nit: we usually add new lines to the ends of the files.

Yes, @xazax.hun is correct.

It's incorrect to say that the static analyzer "doesn't seem to be able to make up its mind". The analyzer gives perfectly clear and consistent answers for each execution path it explores and it's not surprising that the results are different on different execution paths. The presence of the FALSE path indicates indicates that the test indeed doesn't pass: an impossible execution path is being explored.

Eagerly-assume is a thing because it produces easier constraints for the solver to solve. Also the state split is pretty justified on any boolean expression with no side effects.

Fixing up tests

Is the syntax of specifying expected notes and warnings documented somewhere? I could not find the note-specific syntax.

Harbormaster completed remote builds in B113738: Diff 358267.Jul 13 2021, 8:45 AM

Post-rebase cleanup

Harbormaster completed remote builds in B114719: Diff 359605.Jul 18 2021, 3:03 AM

Fixed up tests

Marked test with FIXME notes

This revision was landed with ongoing or failed builds.Jul 18 2021, 7:25 AM

Closed by commit rGd825309352b4: [analyzer] Handle std::make_unique (authored by RedDocMD). · Explain Why

This revision was automatically updated to reflect the committed changes.

RedDocMD added a commit: rGd825309352b4: [analyzer] Handle std::make_unique.

Harbormaster completed remote builds in B114737: Diff 359627.Jul 18 2021, 7:57 AM

Revision Contents

Path

Size

clang/

include/

clang/

StaticAnalyzer/

Core/

PathSensitive/

SValBuilder.h

8 lines

lib/

StaticAnalyzer/

Checkers/

SmartPtrModeling.cpp

62 lines

Core/

SValBuilder.cpp

17 lines

test/

Analysis/

Inputs/

system-header-simulator-cxx.h

11 lines

smart-ptr-text-output.cpp

37 lines

Diff 352621

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h

Show First 20 Lines • Show All 240 Lines • ▼ Show 20 Lines	public:

/// Conjure a symbol representing heap allocated memory region.		/// Conjure a symbol representing heap allocated memory region.
///		///
/// Note, the expression should represent a location.		/// Note, the expression should represent a location.
DefinedOrUnknownSVal getConjuredHeapSymbolVal(const Expr *E,		DefinedOrUnknownSVal getConjuredHeapSymbolVal(const Expr *E,
const LocationContext *LCtx,		const LocationContext *LCtx,
unsigned Count);		unsigned Count);

		/// Conjure a symbol representing heap allocated memory region.
		///
		/// Note, now, the expression doesn't need to represent a location.
		/// But the type need to!
		DefinedOrUnknownSVal getConjuredHeapSymbolVal(const Expr *E,
		const LocationContext *LCtx,
		QualType type, unsigned Count);

DefinedOrUnknownSVal getDerivedRegionValueSymbolVal(		DefinedOrUnknownSVal getDerivedRegionValueSymbolVal(
SymbolRef parentSymbol, const TypedValueRegion *region);		SymbolRef parentSymbol, const TypedValueRegion *region);

DefinedSVal getMetadataSymbolVal(const void *symbolTag,		DefinedSVal getMetadataSymbolVal(const void *symbolTag,
const MemRegion *region,		const MemRegion *region,
const Expr *expr, QualType type,		const Expr *expr, QualType type,
const LocationContext *LCtx,		const LocationContext *LCtx,
unsigned count);		unsigned count);
▲ Show 20 Lines • Show All 161 Lines • Show Last 20 Lines

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp

Show All 29 Lines

#include "clang/StaticAnalyzer/Core/PathSensitive/SymExpr.h" #include "clang/StaticAnalyzer/Core/PathSensitive/SymExpr.h"

#include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h" #include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"

#include <string> #include <string>

using namespace clang; using namespace clang;

using namespace ento; using namespace ento;

namespace { namespace {

class SmartPtrModeling class SmartPtrModeling

: public Checker<eval::Call, check::DeadSymbols, check::RegionChanges, : public Checker<eval::Call, check::DeadSymbols, check::RegionChanges,

check::LiveSymbols> { check::LiveSymbols> {

bool isBoolConversionMethod(const CallEvent &Call) const; bool isBoolConversionMethod(const CallEvent &Call) const;

public: public:

// Whether the checker should model for null dereferences of smart pointers. // Whether the checker should model for null dereferences of smart pointers.

Show All 25 Lines private:

using SmartPtrMethodHandlerFn = using SmartPtrMethodHandlerFn =

void (SmartPtrModeling::*)(const CallEvent &Call, CheckerContext &) const; void (SmartPtrModeling::*)(const CallEvent &Call, CheckerContext &) const;

CallDescriptionMap<SmartPtrMethodHandlerFn> SmartPtrMethodHandlers{ CallDescriptionMap<SmartPtrMethodHandlerFn> SmartPtrMethodHandlers{

{{"reset"}, &SmartPtrModeling::handleReset}, {{"reset"}, &SmartPtrModeling::handleReset},

{{"release"}, &SmartPtrModeling::handleRelease}, {{"release"}, &SmartPtrModeling::handleRelease},

{{"swap", 1}, &SmartPtrModeling::handleSwap}, {{"swap", 1}, &SmartPtrModeling::handleSwap},

{{"get"}, &SmartPtrModeling::handleGet}}; {{"get"}, &SmartPtrModeling::handleGet}};

const CallDescription StdMakeUniqueCall{{"std", "make_unique"}};

const CallDescription StdMakeUniqueForOverwriteCall{

{"std", "make_unique_for_overwrite"}};

}; };

} // end of anonymous namespace } // end of anonymous namespace

REGISTER_MAP_WITH_PROGRAMSTATE(TrackedRegionMap, const MemRegion *, SVal) REGISTER_MAP_WITH_PROGRAMSTATE(TrackedRegionMap, const MemRegion *, SVal)

// Define the inter-checker API. // Define the inter-checker API.

namespace clang { namespace clang {

namespace ento { namespace ento {

▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines static ProgramStateRef updateSwappedRegion(ProgramStateRef State,

if (RegionInnerPointerVal) { if (RegionInnerPointerVal) {

State = State->set<TrackedRegionMap>(Region, *RegionInnerPointerVal); State = State->set<TrackedRegionMap>(Region, *RegionInnerPointerVal);

} else { } else {

State = State->remove<TrackedRegionMap>(Region); State = State->remove<TrackedRegionMap>(Region);

} }

return State; return State;

} }

// This is for use with standalone-functions like std::make_unique,

// std::make_unique_for_overwrite, etc. It reads the template parameter and

// returns the pointer type corresponding to it,

static QualType getPointerTypeFromTemplateArg(const CallEvent &Call,

CheckerContext &C) {

const auto *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());

if (!FD || !FD->isFunctionTemplateSpecialization())

return {};

const auto &TemplateArgs = FD->getTemplateSpecializationArgs()->asArray();

if (TemplateArgs.size() == 0)

xazax.hunUnsubmitted

Done

Nit: consider TemplateArgs.empty().

xazax.hun: Nit: consider `TemplateArgs.empty()`.

return {};

auto ValueType = TemplateArgs[0].getAsType();

return C.getASTContext().getPointerType(ValueType.getCanonicalType());

}

// Helper method to get the inner pointer type of specialized smart pointer // Helper method to get the inner pointer type of specialized smart pointer

// Returns empty type if not found valid inner pointer type. // Returns empty type if not found valid inner pointer type.

static QualType getInnerPointerType(const CallEvent &Call, CheckerContext &C) { static QualType getInnerPointerType(const CallEvent &Call, CheckerContext &C) {

const auto *MethodDecl = dyn_cast_or_null<CXXMethodDecl>(Call.getDecl()); const auto *MethodDecl = dyn_cast_or_null<CXXMethodDecl>(Call.getDecl());

if (!MethodDecl || !MethodDecl->getParent()) if (!MethodDecl || !MethodDecl->getParent())

return {}; return {};

const auto *RecordDecl = MethodDecl->getParent(); const auto *RecordDecl = MethodDecl->getParent();

Show All 10 Lines static QualType getInnerPointerType(const CallEvent &Call, CheckerContext &C) {

auto InnerValueType = TemplateArgs[0].getAsType(); auto InnerValueType = TemplateArgs[0].getAsType();

return C.getASTContext().getPointerType(InnerValueType.getCanonicalType()); return C.getASTContext().getPointerType(InnerValueType.getCanonicalType());

} }

// Helper method to pretty print region and avoid extra spacing. // Helper method to pretty print region and avoid extra spacing.

static void checkAndPrettyPrintRegion(llvm::raw_ostream &OS, static void checkAndPrettyPrintRegion(llvm::raw_ostream &OS,

const MemRegion *Region) { const MemRegion *Region) {

if (Region->canPrintPretty()) { if (Region->canPrintPretty()) {

OS << " "; OS << " ";

teemperorUnsubmitted

Done

getParent() will assert when the MethodDecl is not a in a Record (which is the only way this can return null) so you can remove that check.

teemperor: `getParent()` will assert when the MethodDecl is not a in a Record (which is the only way this…

Region->printPretty(OS); Region->printPretty(OS);

} }

bool SmartPtrModeling::isBoolConversionMethod(const CallEvent &Call) const { bool SmartPtrModeling::isBoolConversionMethod(const CallEvent &Call) const {

// TODO: Update CallDescription to support anonymous calls? // TODO: Update CallDescription to support anonymous calls?

// TODO: Handle other methods, such as .get() or .release(). // TODO: Handle other methods, such as .get() or .release().

// But once we do, we'd need a visitor to explain null dereferences // But once we do, we'd need a visitor to explain null dereferences

// that are found via such modeling. // that are found via such modeling.

const auto *CD = dyn_cast_or_null<CXXConversionDecl>(Call.getDecl()); const auto *CD = dyn_cast_or_null<CXXConversionDecl>(Call.getDecl());

return CD && CD->getConversionType()->isBooleanType(); return CD && CD->getConversionType()->isBooleanType();

} }

bool SmartPtrModeling::evalCall(const CallEvent &Call, bool SmartPtrModeling::evalCall(const CallEvent &Call,

CheckerContext &C) const { CheckerContext &C) const {

NoQUnsubmitted

Done

Please use CallDescription for most of this stuff.

NoQ: Please use `CallDescription` for most of this stuff.

ProgramStateRef State = C.getState(); ProgramStateRef State = C.getState();

if (Call.isCalled(StdMakeUniqueCall) ||

Call.isCalled(StdMakeUniqueForOverwriteCall)) {

const Optional<SVal> ThisRegionOpt = Call.getReturnValueUnderConstruction();

if (!ThisRegionOpt)

return false;

NoQUnsubmitted

Done

Can you do a getConjuredHeapSymbolVal() instead? That'd give us the right memory space as well as the extra bit of information that the new symbol doesn't alias with any previous symbols.

I get it that it doesn't accept the type but it's perfectly ok for you to teach it how to accept the type.

NoQ: Can you do a `getConjuredHeapSymbolVal()` instead? That'd give us the right memory space as…

const auto PtrVal = C.getSValBuilder().getConjuredHeapSymbolVal(

Call.getOriginExpr(), C.getLocationContext(),

getPointerTypeFromTemplateArg(Call, C), C.blockCount());

NoQUnsubmitted

Done

Untyped region isn't a region without a type; everything has a type. Untyped region is when we don't know the type. A typical situation that produces untyped region is when the region comes in through a void pointer.

I vaguely remember that one way to trick your specific code may be to do

std::unique_ptr<int> foo() {
  return make_unique<int>(123);
}

which will RVO into an unknown region. I also wouldn't rely on it being typed in all other cases.

A much safer way to access the inner pointer type would be to query the function's template parameter.

NoQ: Untyped region isn't a region without a type; everything has a type. Untyped region is when we…

const MemRegion *ThisRegion = ThisRegionOpt->getAsRegion();

State = State->set<TrackedRegionMap>(ThisRegion, PtrVal);

State = State->assume(PtrVal, true);

// TODO: ExprEngine should do this for us.

// For a bit more context:

// 1) Why do we need this? Since we are modelling a "function"

// that returns a constructed object we need to store this information in

// the program state.

vsavchenkoUnsubmitted

Not Done

I think it's fine to have it here in this commit and change it in one of the future commits. But I think we should add a bit more description of 1) what's the problem, 2) how is this a solution, 3) how should it look when moved to the engine.

vsavchenko: I think it's fine to have it here in this commit and change it in one of the future commits.

// 2) Why does this work?

// `updateObjectsUnderConstruction` does exactly as it sounds.

// 3) How should it look like when moved to the Engine?

NoQUnsubmitted

Done

I suggest a TODO: ExprEngine should do this for us..

NoQ: I suggest a `TODO: ExprEngine should do this for us.`.

// It would be nice if we can just

// pretend we don't need to know about this - ie, completely automatic work.

NoQUnsubmitted

Done

Do we need a note here as well? I guess we don't because we'll never emit null dereference reports against a non-null pointer. But if we later emit more sophisticated bug reports, we might need one. Maybe leave a comment?

NoQ: Do we need a note here as well? I guess we don't because we'll never emit null dereference…

// However, realistically speaking, I think we would need to "signal" the

// ExprEngine evalCall handler that we are constructing an object with this

// function call (constructors obviously construct, hence can be

// automatically deduced).

auto &Engine = State->getStateManager().getOwningEngine();

State = Engine.updateObjectsUnderConstruction(

*ThisRegionOpt, nullptr, State, C.getLocationContext(),

Call.getConstructionContext(), {});

// We don't leave a note here since it is guaranteed the

// unique_ptr from this call is non-null (hence is safe to de-reference).

C.addTransition(State);

return true;

}

if (!smartptr::isStdSmartPtrCall(Call)) if (!smartptr::isStdSmartPtrCall(Call))

return false; return false;

if (isBoolConversionMethod(Call)) { if (isBoolConversionMethod(Call)) {

const MemRegion *ThisR = const MemRegion *ThisR =

cast<CXXInstanceCall>(&Call)->getCXXThisVal().getAsRegion(); cast<CXXInstanceCall>(&Call)->getCXXThisVal().getAsRegion();

if (ModelSmartPtrDereference) { if (ModelSmartPtrDereference) {

▲ Show 20 Lines • Show All 88 Lines • ▼ Show 20 Lines void SmartPtrModeling::checkDeadSymbols(SymbolReaper &SymReaper,

ProgramStateRef State = C.getState(); ProgramStateRef State = C.getState();

// Clean up dead regions from the region map. // Clean up dead regions from the region map.

TrackedRegionMapTy TrackedRegions = State->get<TrackedRegionMap>(); TrackedRegionMapTy TrackedRegions = State->get<TrackedRegionMap>();

for (auto E : TrackedRegions) { for (auto E : TrackedRegions) {

const MemRegion *Region = E.first; const MemRegion *Region = E.first;

bool IsRegDead = !SymReaper.isLiveRegion(Region); bool IsRegDead = !SymReaper.isLiveRegion(Region);

if (IsRegDead) if (IsRegDead)

State = State->remove<TrackedRegionMap>(Region); State = State->remove<TrackedRegionMap>(Region);

NoQUnsubmitted

Done

Please merge with isStdSmartPtrCall.

NoQ: Please merge with `isStdSmartPtrCall`.

RedDocMDAuthorUnsubmitted

Done

I don't think that can be done because isStdSmartPtrCall checks calls like a.foo(), checking if a is a smart-ptr. On the other hand I need to check for statements like a = foo() and ensure that a is a smart-ptr.

RedDocMD: I don't think that can be done because `isStdSmartPtrCall` checks //calls// like `a.foo()`…

} }

C.addTransition(State); C.addTransition(State);

} }

NoQUnsubmitted

Done

bool isUniquePtrType(QualType QT) {

- const auto *T = QT.getTypePtr();

- if (!T)

+ if (QT.isNull())

return false;

- const auto *Decl = T->getAsCXXRecordDecl();

+ const auto *Decl = QT->getAsCXXRecordDecl();

if (!Decl || !Decl->getDeclContext()->isStdNamespace())

getTypePtr() is almost never necessary because QualType has an overloaded operator->().

NoQ: `getTypePtr()` is almost never necessary because `QualType` has an overloaded `operator->()`.

void SmartPtrModeling::printState(raw_ostream &Out, ProgramStateRef State, void SmartPtrModeling::printState(raw_ostream &Out, ProgramStateRef State,

const char *NL, const char *Sep) const { const char *NL, const char *Sep) const {

TrackedRegionMapTy RS = State->get<TrackedRegionMap>(); TrackedRegionMapTy RS = State->get<TrackedRegionMap>();

if (!RS.isEmpty()) { if (!RS.isEmpty()) {

Out << Sep << "Smart ptr regions :" << NL; Out << Sep << "Smart ptr regions :" << NL;

for (auto I : RS) { for (auto I : RS) {

I.first->dumpToStream(Out); I.first->dumpToStream(Out);

▲ Show 20 Lines • Show All 350 Lines • Show Last 20 Lines

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp

Show First 20 Lines • Show All 186 Lines • ▼ Show 20 Lines	DefinedOrUnknownSVal SValBuilder::conjureSymbolVal(const Stmt *stmt,
return nonloc::SymbolVal(sym);		return nonloc::SymbolVal(sym);
}		}

DefinedOrUnknownSVal		DefinedOrUnknownSVal
SValBuilder::getConjuredHeapSymbolVal(const Expr *E,		SValBuilder::getConjuredHeapSymbolVal(const Expr *E,
const LocationContext *LCtx,		const LocationContext *LCtx,
unsigned VisitCount) {		unsigned VisitCount) {
QualType T = E->getType();		QualType T = E->getType();
assert(Loc::isLocType(T));		return getConjuredHeapSymbolVal(E, LCtx, T, VisitCount);
assert(SymbolManager::canSymbolicate(T));		}
if (T->isNullPtrType())
return makeZeroVal(T);		DefinedOrUnknownSVal
		SValBuilder::getConjuredHeapSymbolVal(const Expr *E,
		const LocationContext *LCtx,
		QualType type, unsigned VisitCount) {
		assert(Loc::isLocType(type));
		assert(SymbolManager::canSymbolicate(type));
		if (type->isNullPtrType())
		return makeZeroVal(type);

SymbolRef sym = SymMgr.conjureSymbol(E, LCtx, T, VisitCount);		SymbolRef sym = SymMgr.conjureSymbol(E, LCtx, type, VisitCount);
return loc::MemRegionVal(MemMgr.getSymbolicHeapRegion(sym));		return loc::MemRegionVal(MemMgr.getSymbolicHeapRegion(sym));
}		}

DefinedSVal SValBuilder::getMetadataSymbolVal(const void *symbolTag,		DefinedSVal SValBuilder::getMetadataSymbolVal(const void *symbolTag,
const MemRegion *region,		const MemRegion *region,
const Expr *expr, QualType type,		const Expr *expr, QualType type,
const LocationContext *LCtx,		const LocationContext *LCtx,
unsigned count) {		unsigned count) {
▲ Show 20 Lines • Show All 678 Lines • Show Last 20 Lines

clang/test/Analysis/Inputs/system-header-simulator-cxx.h

Show First 20 Lines • Show All 972 Lines • ▼ Show 20 Lines	public:
unique_ptr<T> &operator=(nullptr_t) noexcept;		unique_ptr<T> &operator=(nullptr_t) noexcept;
};		};

// TODO :: Once the deleter parameter is added update with additional template parameter.		// TODO :: Once the deleter parameter is added update with additional template parameter.
template <typename T>		template <typename T>
void swap(unique_ptr<T> &x, unique_ptr<T> &y) noexcept {		void swap(unique_ptr<T> &x, unique_ptr<T> &y) noexcept {
x.swap(y);		x.swap(y);
}		}

		template <class T, class... Args>
		unique_ptr<T> make_unique(Args &&...args);

		#if __cplusplus >= 202002L

		template <class T>
		unique_ptr<T> make_unique_for_overwrite();

		#endif

} // namespace std		} // namespace std
#endif		#endif

#ifdef TEST_INLINABLE_ALLOCATORS		#ifdef TEST_INLINABLE_ALLOCATORS
namespace std {		namespace std {
void *malloc(size_t);		void *malloc(size_t);
void free(void *);		void free(void *);
}		}
▲ Show 20 Lines • Show All 154 Lines • Show Last 20 Lines

clang/test/Analysis/smart-ptr-text-output.cpp

	// RUN: %clang_analyze_cc1\			// RUN: %clang_analyze_cc1\
	// RUN: -analyzer-checker=core,cplusplus.Move,alpha.cplusplus.SmartPtr\			// RUN: -analyzer-checker=core,cplusplus.Move,alpha.cplusplus.SmartPtr\
	// RUN: -analyzer-config cplusplus.SmartPtrModeling:ModelSmartPtrDereference=true\			// RUN: -analyzer-config cplusplus.SmartPtrModeling:ModelSmartPtrDereference=true\
				// RUN: -analyzer-output=text -std=c++20 %s -verify=expected
				NoQUnsubmitted Done Reply Inline Actions Could you double-check that this flag correctly disables all the newly introduced modeling so that it wasn't accidentally enabled for all users before it's ready? NoQ: Could you double-check that this flag correctly disables all the newly introduced modeling so…
				RedDocMDAuthorUnsubmitted Done Reply Inline Actions Yup. There are 133 notes and warnings (just search for "// expected-" and see the count). And on setting the flag to `false`, there are 133 errors. So it works. RedDocMD: Yup. There are 133 notes and warnings (just search for "// expected-" and see the count). And…
				NoQUnsubmitted Done Reply Inline Actions I mean, specifically the modeling added by this patch? I see you wrote an entire checker callback that doesn't seem to have an analyzer-config option check anywhere in it. Simply having different results isn't sufficient to verify this. NoQ: I mean, specifically the modeling added by this patch? I see you wrote an entire checker…
				RedDocMDAuthorUnsubmitted Done Reply Inline Actions Oh I see. RedDocMD: Oh I see.

				// RUN: %clang_analyze_cc1\
				// RUN: -analyzer-checker=core,cplusplus.Move,alpha.cplusplus.SmartPtr\
				// RUN: -analyzer-config cplusplus.SmartPtrModeling:ModelSmartPtrDereference=true\
	// RUN: -analyzer-output=text -std=c++11 %s -verify=expected			// RUN: -analyzer-output=text -std=c++11 %s -verify=expected
				teemperorUnsubmitted Done Reply Inline Actions You set the standard here via the `-std` arg. If you want different standards you need to have multiple `RUN:` lines that all pass a different standard. teemperor: You set the standard here via the `-std` arg. If you want different standards you need to have…
				RedDocMDAuthorUnsubmitted Done Reply Inline Actions Yup, thanks! RedDocMD: Yup, thanks!

	#include "Inputs/system-header-simulator-cxx.h"			#include "Inputs/system-header-simulator-cxx.h"

	class A {			class A {
	public:			public:
	A(){};			A(){};
	void foo();			void foo();
	};			};
	▲ Show 20 Lines • Show All 295 Lines • ▼ Show 20 Lines
	void derefAfterBranchingOnUnknownInnerPtr(std::unique_ptr<A> P) {			void derefAfterBranchingOnUnknownInnerPtr(std::unique_ptr<A> P) {
	A *RP = P.get();			A *RP = P.get();
	if (!RP) { // expected-note {{Assuming 'RP' is null}}			if (!RP) { // expected-note {{Assuming 'RP' is null}}
	// expected-note@-1 {{Taking true branch}}			// expected-note@-1 {{Taking true branch}}
	P->foo(); // expected-warning {{Dereference of null smart pointer 'P' [alpha.cplusplus.SmartPtr]}}			P->foo(); // expected-warning {{Dereference of null smart pointer 'P' [alpha.cplusplus.SmartPtr]}}
	// expected-note@-1{{Dereference of null smart pointer 'P'}}			// expected-note@-1{{Dereference of null smart pointer 'P'}}
	}			}
	}			}

				void makeUniqueReturnsNonNullUniquePtr() {
				auto P = std::make_unique<A>();
				if (!P) { // expected-note {{Taking false branch}}
				P->foo(); // should have no warning here, path is impossible
				}
				P.reset(); // expected-note {{Smart pointer 'P' reset using a null value}}
				// Now P is null
				if (!P) {
				// expected-note@-1 {{Taking true branch}}
				P->foo(); // expected-warning {{Dereference of null smart pointer 'P' [alpha.cplusplus.SmartPtr]}}
				// expected-note@-1{{Dereference of null smart pointer 'P'}}
				}
				}

				#if __cplusplus >= 202002L

				void makeUniqueForOverwriteReturnsNullUniquePtr() {
				auto P = std::make_unique_for_overwrite<A>();
				NoQUnsubmitted Not Done Reply Inline Actions Mmm wait a sec, that doesn't look like what the spec says. https://en.cppreference.com/w/cpp/memory/unique_ptr/make_unique: Same as (1), except that the object is default-initialized. This overload participates in overload resolution only if T is not an array type. The function is equivalent to `unique_ptr<T>(new T)` It zero-initializes the pointee, not the pointer. The difference between `std::make_unique<A>()` and `std::make_unique_for_overwrite<A>()` is the difference between value-initialization (invoking the default constructor) and zero-initialization (simply filling the buffer with `0`s). NoQ: Mmm wait a sec, that doesn't look like what the spec says. https://en.cppreference.
				RedDocMDAuthorUnsubmitted Done Reply Inline Actions Oops! Look like I completely misunderstood the spec. So basically, the inner pointer is not null, and we have the same assumptions that apply for `std::make_unique`. Only problem is that, for primitive types, the value obtained on de-referencing is undefined. RedDocMD: Oops! Look like I completely misunderstood the spec. So basically, the inner pointer is //not//…
				if (!P) { // expected-note {{Taking false branch}}
				P->foo(); // should have no warning here, path is impossible
				}
				P.reset(); // expected-note {{Smart pointer 'P' reset using a null value}}
				// Now P is null
				if (!P) {
				// expected-note@-1 {{Taking true branch}}
				P->foo(); // expected-warning {{Dereference of null smart pointer 'P' [alpha.cplusplus.SmartPtr]}}
				// expected-note@-1{{Dereference of null smart pointer 'P'}}
				}
				}

				#endif
				xazax.hunUnsubmitted Not Done Reply Inline Actions Nit: we usually add new lines to the ends of the files. xazax.hun: Nit: we usually add new lines to the ends of the files.

This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Handle std::make_unique for SmartPtrModelingClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 352621

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h

clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp

clang/test/Analysis/Inputs/system-header-simulator-cxx.h

clang/test/Analysis/smart-ptr-text-output.cpp

[analyzer] Handle std::make_unique for SmartPtrModeling
ClosedPublic