This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Process non-POD array element destructors
ClosedPublic

Authored by isuckatcs on Jul 28 2022, 5:19 PM.

Details

Summary

The construction of the elements is evaluated under certain conditions.
This patch makes sure that the destructors are also evaluated in such cases.

Diff Detail

Event Timeline

isuckatcs created this revision.Jul 28 2022, 5:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2022, 5:19 PM
isuckatcs requested review of this revision.Jul 28 2022, 5:19 PM
isuckatcs added inline comments.Jul 28 2022, 5:28 PM
clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2376–2391 ↗(On Diff #448479)

This is something I couldn't find a solution for so far. I guess removing the
this region should happen in ExprEngine::processCallExit(). In Step 3, LastSt
is a nullptr, so the else branch is hit. Calling removeDead(), killBinding() or
invalidateRegions() from that branch all lead to a crash.

Attempts to remove the region (if exists) before inlining the dtor also failed.

isuckatcs updated this revision to Diff 448599.Jul 29 2022, 5:43 AM
isuckatcs edited the summary of this revision. (Show Details)

Handling destructors invoked by delete expression.

isuckatcs updated this revision to Diff 448623.Jul 29 2022, 7:05 AM
isuckatcs edited the summary of this revision. (Show Details)

Added support for evaluating member dtors.

isuckatcs updated this revision to Diff 448645.Jul 29 2022, 8:56 AM
isuckatcs edited the summary of this revision. (Show Details)

Added simple test cases.

isuckatcs updated this revision to Diff 448737.Jul 29 2022, 4:11 PM

Fixed the issue that caused crashes.

NoQ added a comment.Jul 31 2022, 11:26 PM

Hey folks, I'm mostly back in action. Sounds like I missed a lot!

clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1303–1304

It looks like the only reason you need to getHeapRegionInitializer() is because you want a Ty. Why isn't DTy sufficient in this case? It seems counter-intuitive that we can't tell the *type* of the object we're about to destroy, when we also have a CXXDestructorDecl from which we can, at least, get to the parent CXXRecordDecl which is as close to the type as it gets.

clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2376–2391 ↗(On Diff #448479)

Oh interesting, looks like removeDead() wants a statement which we don't have, neither in the caller (there's no call-expression) nor in the callee (the callee body is empty).

What program point are we using anyway? We might need a new ***PurgeDeadSymbols program point just for that purpose. It seems daunting to add a new program point just because we're hitting an assertion but it's generally a pretty good assertion and it looks like we can get some better quality diagnostics if we make this work.

isuckatcs marked an inline comment as done.Aug 1 2022, 5:30 AM
isuckatcs added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1303–1304

The destructors are treated as methods, so as a result the type we have access to is the type of the instance itself, on which we call the method.
For example:

Aggregate *arr = new Aggregate[4];
delete[] arr;

When delete[] is called, DTy is

RecordType 'struct Aggregate'
`-CXXRecord 'Aggregate'

So we have access to this type, and we have access to DE->isArrayForm(), however these 2 informations are still not sufficient to decide if we want to inline the destructor or not. To do that we need to know the size of the array, or we need to know if we have inlined the construction before.

Ty on the other hand stores the type of the expression that initialized the region, which is the CXXConstructExpr inside a CXXNewExpr in most case. This expression has the information about the size of the array.

The CXXNewExpr for the example snippet looks like this:

CXXNewExpr 'struct Aggregate *' array Function 'operator new[]' 'void *(unsigned long)'
|-ImplicitCastExpr 'unsigned long' <IntegralCast>
| `-IntegerLiteral 'int' 4
`-CXXConstructExpr 'struct Aggregate[4]' 'void (void) noexcept'

Then NewExpr->getInitializer() returns CXXConstructExpr 'struct Aggregate[4]' 'void (void) noexcept', the type of which, and hence Ty is :

ConstantArrayType 'struct Aggregate[4]' 4 
`-RecordType 'struct Aggregate'
  `-CXXRecord 'Aggregate'

Using this type we have all the necessary information required for making a decision about inlining.

I agree that this is not the cleanest approach and it has flaws like the initializer is not removed from the map if the developer forgets to free the allocated memory. Also in some cases, like calling placement new, the initializer is added to the map twice, and is removed only once.

Then we have the issue I mentioned in the comment in the code about the region being allocated using ::operator new[], though that might be ignored since calling delete[] instead of ::operator delete[] on that region is illegal anyway. And in that case the destructors are supposed to be called manually on each element, which we model as regular method calls, so it doesn't require any additional work to evaluate them.

Also note that the initializer is paired with the region itself, and not the declaration which allows us to reason about snippets like this:

Aggregate *foo() {
    auto *arr = new Aggregate[4];
    return arr;
}

int main() {
    Aggregate *x = foo();
    auto *y = x;
    delete[] y;

    return 0;
}

In this snippet delete[] receives a pointer, which has been initialized by another pointer and this can go on and on. In this case I think we really need to know about what initialized the region being freed in the first place, otherwise there's no way to figure it out when we reach delete[].

isuckatcs marked an inline comment as done.Aug 1 2022, 5:32 AM
NoQ added inline comments.Aug 1 2022, 1:31 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1303–1304

Ok so you're really interested only in the size. In this case looking up the initializer expression doesn't always help: the size isn't necessarily a compile-time constant (but it can still be a known, concrete integer on the current path).

It sounds like the right solution is to query the region's extent (https://clang.llvm.org/doxygen/DynamicExtent_8h_source.html). That's a state trait that keeps track of dynamically allocated memory size, in bytes. Statically allocated regions don't need their extent tracked, it can be determined at any time, but they can be obtained through the same interface. If the extent is a concrete integer, you can divide it by the size of the class to figure out the allocation size.

This seems to be somewhat consistent with the actual runtime behavior. Operator new[] stores the allocation size and then delete[] looks it up. It's part of the program memory, we're modeling it with a state trait. The actual program stores the size, we should also store the size.

Side note, the actual program typically stores the size contiguously in the same memory block that's returned from operator new[], so the actual allocation size becomes a bit bigger (the actual difference is unspecified by the standard) but we don't need to model it so precisely, let's instead say that the size is stored *somewhere* (in the state trait). Unless we're in a @martong's checker (D129280).

isuckatcs added inline comments.Aug 1 2022, 2:33 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1303–1304

Ok so you're really interested only in the size. In this case looking up the initializer expression doesn't always help: the size isn't necessarily a compile-time constant (but it can still be a known, concrete integer on the current path).

I'm only interested in the size, because in ExprEngine::shouldInlineArrayConstruction() (D127973) that's what is used to decide whether we inline the constructors or not. We only want to inline the destructors if the constructors have been inlined I guess. I mean destructors are mostly used to clean up resources, so if we don't model the resources because the constructors haven't been inlined, we can't reason about cleaning them up either.

Operator new[] stores the allocation size and then delete[] looks it up. It's part of the program memory, we're modeling it with a state trait.

I knew that's how new[] and delete[] works, I just wasn't aware that we already model it. I wanted to replicate the same behaviour by creating that trait that stores the initialzer, which stores the size and more (I was thinking about the stored initializer as a record that stores the information about the allocated region).

so the actual allocation size becomes a bit bigger (the actual difference is unspecified by the standard)

In my understanding this 'overhead' has been removed from the standard as I mentioned it in D129280
(in an inline comment), though I might be wrong about this one.

isuckatcs updated this revision to Diff 449165.Aug 1 2022, 7:02 PM
isuckatcs marked an inline comment as done.
  • Rebased
  • Using DynamicExtent to get the region size where needed
  • Removed HeapRegionInitializer related changes
NoQ added inline comments.Aug 1 2022, 10:14 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
210

Hmmm. This key doesn't really allow us to uniquely identify the destructor that we're calling. We can have two arrays of the same type in the same scope, they'll have the same destructor decl and the same location context. I guess they still can't happen *simultaneously* in the same location context, so we still know which one we're dealing with, right? But in this case, do we need the decl as part of the key at all? Maybe just location context could be sufficient?

1303–1304

Aha, oh right sorry, I remember you commenting on that patch! In any case, yay, this looks nice!

clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
274

A comment about what may cause this to happen would be great!

martong added inline comments.Aug 2 2022, 7:36 AM
clang/test/Analysis/dtor-array.cpp
10–11

These two variables seem to be redundant to me. Maybe it is just me, but this makes the tests convoluted and harder to read. Wouldn't it be enough to simply check in each test ?

clang_analyzer_eval(InlineDtor::dtorCalled == 3); // expected-warning {{TRUE}}

You reset dtorCalled anyway at the beginning of each test.

isuckatcs updated this revision to Diff 449291.Aug 2 2022, 7:59 AM
isuckatcs marked 3 inline comments as done.
  • Destructor inlining only depends on DynamicExtent.
  • Cleaned up repeated code in process_____Dtor functions.
  • CXXDestructorDecl is no longer stored as part of a key.
  • Added an additional test case.
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
210

LocationContext seems to be sufficient, so I changed the map so that it's the only key. If we encounter a crash or an issue later, we can always change it to something else.

clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
274

It doesn't happen anymore.

isuckatcs added inline comments.Aug 2 2022, 8:10 AM
clang/test/Analysis/dtor-array.cpp
10–11

I want to test for if the destructor is called for every element and not for the same element multiple times. Somehow I want to make sure that the destructor of each element does something different.

The idea is that cnt always starts from a different number, so this way I can make sure that between 2 functions, the values of a, b, c and d actually change and I don't check for the results of a previous function.

I also want to have a variable that tracks the number of destructor invocations, so I can make sure that each invocation does something different. This variable is dtorCalled.

Testing the destructors is tricky, as they only cause side effects and I couldn't find a better solution to separate everything properly.

xazax.hun added inline comments.Aug 2 2022, 10:40 AM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
211

Would it make sense to also include this in the ExplodedGraph dump?

isuckatcs updated this revision to Diff 449419.Aug 2 2022, 1:50 PM

Destructors are now called in the proper order.

isuckatcs updated this revision to Diff 449697.Aug 3 2022, 10:12 AM
isuckatcs marked an inline comment as done.

Introduced a solution to the assertion failure in RegionStoreManager::bind().

clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2376–2391 ↗(On Diff #448479)

I found that our dead symbol removing strategies are centered around expressions, so introducing a new strategy that doesn't require and expression would be a lot of work (at least a lot more than the current solution).

Since we only encounter this issue with default destructors I think it doesn't make sense to rework our existing solutions. Instead I introduced a domain specific solution focusing only on this problem.

NoQ added a comment.Aug 3 2022, 9:03 PM

Aha great, everything makes sense now! I have a final round of minor comments.

clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
211

Yes!

That's a boring piece of work but it helps immensely when debugging bugs.

clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
294

There appears to be a convenient wrapper for this operation:

newState = state->killBinding(ThisVal);
301

Do we actually need that? The contents of the region don't actually change, they're simply "forgotten", there are no writes or invalidations happening for the checkers to react to.

isuckatcs marked 2 inline comments as done.Aug 4 2022, 12:11 AM
isuckatcs added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
294

Sadly that wrapper cannot be used on regions. In the first line there is an assertion that prevents us doing so.

assert(!isa<loc::MemRegionVal>(LV) && "Use invalidateRegion instead.");

ProgramState::invalidateRegions() on the other hand takes a const Expr *, which cannot be a nullptr, otherwhise clang crashes. This is what I meant by we really seem to depend on expressions when it comes to invalidation. I guess this method is prefered because it invalidates any additional region that depends on the region we initially want to invalidate, which is the this region in this case.

Also we have ProgramState::makeWithStore() which would be useful, however it is protected.

Though I aggree that this snippet does the same as ProgramState::killBinding() but bypassing the assertion.

301

In ProgramState::invalidateRegionsImpl() we call it in the end. Though that function calls Mgr.StoreMgr->invalidateRegions() which says:

invalidateRegions - Clears out the specified regions from the store, marking their values as unknown...

So we might indeed not need that.

isuckatcs marked 2 inline comments as done.Aug 4 2022, 12:11 AM
martong added inline comments.Aug 4 2022, 3:03 AM
clang/test/Analysis/dtor-array.cpp
10–11

Okay, makes sense, thanks for the explanation.

martong added inline comments.Aug 4 2022, 3:04 AM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
211

Yeah, but then add also PendingInitLoopMap.

isuckatcs marked 5 inline comments as done.
isuckatcs added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
211

I dumped both of these traits to the egraph in a different patch, D131187.

NoQ added inline comments.Aug 4 2022, 11:40 AM
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
294

assert(!isa<loc::MemRegionVal>(LV) && "Use invalidateRegion instead.");

Ok this definitely makes no sense. There are no other keys in the Store, all of them are regions. Looks like some ancient archeological artifact, maybe it made sense in 2012 but definitely not today.

This API has exactly one call site and it's broken for the above reason.

I think we should throw away the assertion, maybe turn it as a documentation remark.

isuckatcs marked an inline comment as done.Aug 4 2022, 1:00 PM
isuckatcs added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
294

This source file was created in rL137677, and both functions existed in that already. ProgramState::killBinding() was called ProgramState::unbindLoc(), so it seems like we've been having this assertion since the beginning.

I can remove it and see what happens. If it will cause issues in the future, someone can still bring it back.

isuckatcs marked an inline comment as done.Aug 4 2022, 1:01 PM
isuckatcs updated this revision to Diff 450159.Aug 4 2022, 4:04 PM

Removed the assertion from ProgramState::killBinding().

In ExprEngine::processCallExit() I currently remove the this region
even if there is a LastSt just to see how the symbolReaper reacts.

If it crashes in some situation I think it's better to see that crash
immediately instead of waiting for someone to write a patch
that might lead to the same result but will be harder to debug.

xazax.hun added inline comments.Aug 8 2022, 9:41 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
124–126

We are using a single unsigned value for array construction but have these additional values (size, shouldInline) for array destruction. I think this asymmetry can surprise people. I think it might be worth to have a brief comment why we need these for destruction but not for construction.

clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
209

While we classically used typedef to introduce these aliases, I wonder whether we want to switch to usings at some point. Feel free to ignore for this patch.

NoQ accepted this revision.Aug 8 2022, 12:20 PM

Aha looks great! I agree with Gabor's request for comment but in general this is good to go. Some real-world code testing is advised!

This revision is now accepted and ready to land.Aug 8 2022, 12:20 PM
NoQ added a comment.Aug 8 2022, 2:31 PM

Hold on, I think I found a crash.

isuckatcs added a comment.EditedAug 8 2022, 2:34 PM

I've also found one and already working on the fix.

Basically if the record contains 2 arrays with the same size and record type, after the evaluation of the first one a loop gets created in the egraph. As a result in Pred = Bldr.generateNode(EP, State, Pred); Pred will be a nullptr and the analyzer crashes later.

isuckatcs updated this revision to Diff 450980.Aug 8 2022, 3:43 PM
isuckatcs marked 2 inline comments as done.
  • Addressed comments
  • Fixed the issue that was causing a crash
  • Removed PendingArrayDestructionData

This current solution seems to be a bit fragile I'll address that later. Also it turned out temporary destructors also need to be handled. There was no FIXME in ExprEngine::ProcessTemporaryDtor() about arrays so I thought that function works fine.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
124–126

This struct was necessary before the DynamicExtent solution. Now it's no longer necessary, so I removed it.

NoQ requested changes to this revision.Aug 8 2022, 5:16 PM
NoQ added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Yes, there it is:

// clang -std=c++2a --analyze repro.cc
namespace std {
template <class _Tp> class unique_ptr {
  typedef _Tp *pointer;
  pointer __ptr_;

public:
  unique_ptr(pointer __p) : __ptr_(__p) {}
  ~unique_ptr() { reset(); }
  pointer get() {}
  void reset() {}
};
}

struct S;

S *makeS();
int bar(S *x, S *y);

void foo() {
  std::unique_ptr<S> x(makeS()), y(makeS());
  bar(x.get(), y.get());
}
0  clang-15                 0x0000000105f09f9c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 72
1  clang-15                 0x0000000105f0a4f8 PrintStackTraceSignalHandler(void*) + 28
2  clang-15                 0x0000000105f084cc llvm::sys::RunSignalHandlers() + 148
3  clang-15                 0x0000000105f0978c llvm::sys::CleanupOnSignal(unsigned long) + 112
4  clang-15                 0x0000000105d1f69c (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) + 260
5  clang-15                 0x0000000105d1fdbc CrashRecoverySignalHandler(int) + 228
6  libsystem_platform.dylib 0x00000001a31492a4 _sigtramp + 56
7  clang-15                 0x000000010bbbd7c8 clang::ento::ExplodedNode::getLocationContext() const + 32
8  clang-15                 0x000000010bc56328 clang::ento::ExprEngine::VisitCXXDestructor(clang::QualType, clang::ento::MemRegion const*, clang::Stmt const*, bool, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&, clang::ento::EvalCallOptions&) + 168
9  clang-15                 0x000000010bc08174 clang::ento::ExprEngine::ProcessAutomaticObjDtor(clang::CFGAutomaticObjDtor, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) + 852
10 clang-15                 0x000000010bc03c94 clang::ento::ExprEngine::ProcessImplicitDtor(clang::CFGImplicitDtor, clang::ento::ExplodedNode*) + 168
11 clang-15                 0x000000010bc02ee0 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) + 348
12 clang-15                 0x000000010bbbd554 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const*, unsigned int, clang::ento::ExplodedNode*) + 280
13 clang-15                 0x000000010bbbc73c clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) + 720
14 clang-15                 0x000000010bbbc3a4 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>)::$_0::operator()(unsigned int) const + 284
15 clang-15                 0x000000010bbbbd74 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) + 828
16 clang-15                 0x0000000109832738 clang::ento::ExprEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int) + 88
17 clang-15                 0x0000000109831b00 (anonymous namespace)::AnalysisConsumer::RunPathSensitiveChecks(clang::Decl*, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) + 308
18 clang-15                 0x00000001098315fc (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) + 580
19 clang-15                 0x00000001097ab82c (anonymous namespace)::AnalysisConsumer::HandleDeclsCallGraph(unsigned int) + 492
20 clang-15                 0x00000001097aa114 (anonymous namespace)::AnalysisConsumer::runAnalysisOnTranslationUnit(clang::ASTContext&) + 396
21 clang-15                 0x000000010979bc68 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) + 432
22 clang-15                 0x0000000107765310 clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) + 116
23 clang-15                 0x000000010bf45c4c clang::ParseAST(clang::Sema&, bool, bool) + 840
24 clang-15                 0x000000010770c6ec clang::ASTFrontendAction::ExecuteAction() + 296
25 clang-15                 0x000000010770bdb0 clang::FrontendAction::Execute() + 124
26 clang-15                 0x00000001075e61b8 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 824
27 clang-15                 0x0000000107869a28 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 1096
28 clang-15                 0x0000000100e586b4 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 1148
29 clang-15                 0x0000000100e46948 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) + 304
30 clang-15                 0x00000001072b444c clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const::$_1::operator()() const + 40
31 clang-15                 0x00000001072b4418 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const::$_1>(long) + 24
32 clang-15                 0x0000000105d1f4f4 llvm::function_ref<void ()>::operator()() const + 32
33 clang-15                 0x0000000105d1f454 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 272
34 clang-15                 0x00000001072add48 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const + 372
35 clang-15                 0x0000000107238990 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const + 692
36 clang-15                 0x0000000107238da8 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&, bool) const + 160
37 clang-15                 0x0000000107259e54 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&) + 492
38 clang-15                 0x0000000100e45d2c driver_main(int, char**) + 3660
39 clang-15                 0x0000000100e447cc clang_main(int, char**) + 628
40 clang-15                 0x0000000100e824f4 main + 36
41 dyld                     0x0000000230306c14 start + 2372
clang-15: error: clang frontend command failed with exit code 139 (use -v to see invocation)

The crash happens because repeated destructors for x and y cause the same state to appear in the same program point, which leads to a loop in the ExplodedGraph. This, in turn, causes Bldr.generateNode(EP, state, Pred); to return a null pointer (indicating that the node already exists) which is immediately dereferenced in VisitCXXDestructor().

At a glance the root cause of the problem is the loop: there shouldn't be a loop in the graph on this code, the code is completely linear. Looks like EpsilonPoint EP(LCtx, nullptr); should have a more precise identity to discriminate between multiple destructors happening simultaneously, so that the nodes don't get merged. This might be another reason to think about a better program point for this case, to acknowledge the fact that it's a destructor. Maybe CFG's destructor element can be included in the identity hmmm.

This revision now requires changes to proceed.Aug 8 2022, 5:16 PM
NoQ added inline comments.Aug 8 2022, 5:28 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Oh, you already fixed something while I was reducing! Looks like I found the same problem in a different destructor kind, I can confirm that it's still crashing. Yes, PostImplicitCall seems appropriate. Still need some solution to discriminate between these destructors in this case.

isuckatcs added inline comments.Aug 8 2022, 5:29 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Nice. In case of member destructors a crash occured for the same reason. I thought you also found that.

There the solution was to use a`PreImplicitCall` program point with a different location each time.

Opposed to constructors where we have a different statement with a unique ID for each call, with destructors we only have the declaration and the location context which can be the same in a lot of cases. At the moment the best way I found to distinguish them is to tie them to a source location of a related field, value, etc.

NoQ added inline comments.Aug 8 2022, 5:40 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Damn you're typing faster than me.

I think these source locations aren't just part of the identity, they're also used to attach path notes somewhere. In this sense it's important to keep them at the bottom of the function. I think we should add another identity element to these implicit call points and fill it with these variables or memebers (which may or may not be a source location, probably a void * could suffice) (or we could put them into tags but that may be annoying when checkers want to add their own tags).

NoQ added inline comments.Aug 8 2022, 6:04 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Hmm we should probably simply put a CFGElementRef in there.

isuckatcs updated this revision to Diff 451128.Aug 9 2022, 5:59 AM
isuckatcs marked 4 inline comments as done.

Fixed the issue that caused the second crash.

isuckatcs added inline comments.Aug 9 2022, 6:00 AM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

I switched back to EpsilonPoint and I add Pred as the Data1 instead of nullptr. This way ExprEngine will only create a loop if 2 program points have the same predecessor.

1463

What to do with this situation? Ignore for now?

isuckatcs updated this revision to Diff 451148.Aug 9 2022, 7:48 AM

Changed the logic of getElementCountOfArrayBeingDestructed()

NoQ added inline comments.Aug 9 2022, 6:02 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1249

Aha this sounds like cheating but it probably works well given that we're about to diverge immediately anyway. As long as there aren't any loops on tests, I'm happy with that. I'll try to re-run.

1463

As per our offline discussion, it looks like the absence of construction context is the problem here, as it causes an otherwise contiguous array to be split into two separate temporary memory regions. The correct solution is to have them both in the same temporary region and have an array destructor handle that normally. But we don't have to address this immediately.

isuckatcs updated this revision to Diff 451891.Aug 11 2022, 9:56 AM
isuckatcs marked 2 inline comments as done.
  • Handling 0 length arrays
  • Partially handling multidimensional arrays (doesn't work properly for delete and member dtors)
isuckatcs updated this revision to Diff 452035.Aug 11 2022, 4:40 PM

Handling multidimensional delete dtors properly. It turned out the constructor wasn't modelling the type of the region correctly.

Multidimensional member dtors are still an issue. Basically if an object has a multidimensional array member, only the destructor of the object is called. On the other hand if the member array is single dimensional, the dtors are called for every element.

isuckatcs updated this revision to Diff 452041.Aug 11 2022, 5:28 PM

Handling multidimensional member dtors too.

In the CFG we only added member dtors if the array was single dimensional. It happened accidentally back in rL117252.
The other parts of the patch seem to be correct, so nothing else needs to be changed.

Currently every test is passing.

NoQ added a comment.Aug 13 2022, 11:52 PM

Oh damn, I too forgot that they could be multi-dimensional. Nice!!

clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1369

ElementCount isn't passed to prepareStateForArrayDestruction this time, so the assert is never checked.

isuckatcs updated this revision to Diff 452614.Aug 15 2022, 2:31 AM
isuckatcs marked an inline comment as done.
  • Addressed comment
  • Generating a "zero length array skip" node in all cases
NoQ added inline comments.Aug 15 2022, 5:05 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1228

Maybe generate a sink in this case? We got ourselves into an inconsistent state, losing coverage is much preferable to obtaining weird results.

isuckatcs updated this revision to Diff 453813.Aug 18 2022, 3:24 PM
isuckatcs marked an inline comment as done.

Addressed comment

NoQ added a comment.Aug 18 2022, 5:51 PM

Ok one more crash!

This one causes infinite recursion in CoreEngine::dispatchWorkItem() where it handles EpsilonPoint which loops back to itself??? So it can manifest either as a segfault or as a hang depending on whether you're compiling with tail recursion.

struct a {
  a *b;
};
struct c {
  a d;
  c(int);
  ~c() {
    for (a e = d;; e = *e.b)
      ;
  }
};
void f() { c g(0); }

Ok one more crash!

This one causes infinite recursion in CoreEngine::dispatchWorkItem() where it handles EpsilonPoint which loops back to itself??? So it can manifest either as a segfault or as a hang depending on whether you're compiling with tail recursion.

struct a {
  a *b;
};
struct c {
  a d;
  c(int);
  ~c() {
    for (a e = d;; e = *e.b)
      ;
  }
};
void f() { c g(0); }

It seems the source of the issue is the constructor that's declared but not defined. We loop between the blocks of e = *e.b and for (... ;; ...). I changed EpsilonPoint to PreImplicitCall and that allow me look at the egraph. For g we create a conjured symbol, so we shouldn't inline the destructor, but we do and in the end we reach Block count exceeded. This problem was present even before this patch, though the crash was indeed caused by my EpsilonPoints.

The egraph looks like this before this patch:

Basically we exceeded the count and retried without inlining so we generate the EpsilonPoint in ExprEngine::replayWithoutInlining(). This epsilon point turnes out to be the same that I generated in the destructor processing functions, hence the loop.

Now that I changed EpsilonPoint to PreImplicitCall, the resulting egraph looks like this:

isuckatcs updated this revision to Diff 454003.Aug 19 2022, 8:06 AM
  • EpsilonPoint -> PreImplicitCall
  • Added test for multidimensional zero length array
isuckatcs updated this revision to Diff 454008.Aug 19 2022, 8:11 AM

Added a testcase for the crashing snippet.

NoQ added a comment.Aug 19 2022, 11:26 PM

One more segfault!

namespace std {
template <class _Tp> class unique_ptr {
  _Tp *__ptr_;
public:
  unique_ptr(_Tp *__p) : __ptr_(__p) {}
  ~unique_ptr() {}
};
} // namespace std

int SSL_use_certificate(int *arg) {
  std::unique_ptr<int> free_x509(arg);
  {
    if (SSL_use_certificate(arg)) {
      return 0;
    }
  }
  1;
}
0  clang-16                 0x0000000102f88ea0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  clang-16                 0x0000000102f87ea0 llvm::sys::RunSignalHandlers() + 112
2  clang-16                 0x0000000102f8847c llvm::sys::CleanupOnSignal(unsigned long) + 232
3  clang-16                 0x0000000102eaf098 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) + 208
4  clang-16                 0x0000000102eaf340 CrashRecoverySignalHandler(int) + 128
5  libsystem_platform.dylib 0x00000001a31492a4 _sigtramp + 56
6  clang-16                 0x0000000104c7cf50 clang::ento::ExprEngine::ProcessAutomaticObjDtor(clang::CFGAutomaticObjDtor, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) + 900
7  clang-16                 0x0000000104c7cf50 clang::ento::ExprEngine::ProcessAutomaticObjDtor(clang::CFGAutomaticObjDtor, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) + 900
8  clang-16                 0x0000000104c78f64 clang::ento::ExprEngine::ProcessImplicitDtor(clang::CFGImplicitDtor, clang::ento::ExplodedNode*) + 188
9  clang-16                 0x0000000104c780e8 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) + 160
10 clang-16                 0x0000000104c60874 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const*, unsigned int, clang::ento::ExplodedNode*) + 140
11 clang-16                 0x0000000104c5f840 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) + 940
12 clang-16                 0x000000010444f094 (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) + 1612
13 clang-16                 0x0000000104436eb8 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) + 1596
14 clang-16                 0x0000000103a2cd14 clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) + 52
15 clang-16                 0x0000000104d85100 clang::ParseAST(clang::Sema&, bool, bool) + 744
16 clang-16                 0x00000001039e6e68 clang::FrontendAction::Execute() + 112
17 clang-16                 0x0000000103970f48 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 840
18 clang-16                 0x0000000103a69048 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 936
19 clang-16                 0x0000000100d63318 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 2448
20 clang-16                 0x0000000100d5e5f4 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) + 1028
21 clang-16                 0x0000000103800644 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const::$_1>(long) + 28
22 clang-16                 0x0000000102eaeeb0 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 248
23 clang-16                 0x0000000103800088 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const + 224
24 clang-16                 0x00000001037be630 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const + 504
25 clang-16                 0x00000001037be988 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&, bool) const + 120
26 clang-16                 0x00000001037e8454 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&) + 340
27 clang-16                 0x0000000100d5dbb8 driver_main(int, char**) + 10016
28 clang-16                 0x0000000100d5adac clang_main(int, char**) + 696
29 dyld                     0x0000000230306c14 start + 2372
isuckatcs updated this revision to Diff 454229.Aug 20 2022, 9:33 AM

Handling the case when we cache out a node.

isuckatcs updated this revision to Diff 454245.Aug 20 2022, 1:05 PM

Handling the case when DynamicExtent returns a non-integer array size.

isuckatcs added inline comments.Aug 20 2022, 1:12 PM
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
1131

I've seen the value returned by getDynamicElementCount(State, Region, svalBuilder, Ty) to be
(extent_$1604{SymRegion{conj_$1602{typename std::remove_const<class UniqueArrayElement *>::type, LC32114, S51835792, #1}}}) / 112.

isuckatcs updated this revision to Diff 454313.Aug 21 2022, 8:28 AM
  • Fixed another error prone function
  • Added a new test case
NoQ accepted this revision.Aug 23 2022, 11:32 AM

Ok there's no more crashes!!

Let's try to commit 🤞🤞🤞

This revision is now accepted and ready to land.Aug 23 2022, 11:32 AM
This revision was automatically updated to reflect the committed changes.