This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Diagnose stack leaks via block captures
ClosedPublic

Authored by alexander-shaposhnikov on Oct 30 2017, 3:43 PM.

Details

Summary

This diff extends StackAddrEscapeChecker to catch stack addr leaks via block captures if the block is executed
asynchronously.
Test plan: make check-all

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun added inline comments.Oct 31 2017, 3:44 AM
lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
145 ↗(On Diff #120908)

This will stop the analysis on this execution path. Is this desired? Usually, we stop the execution when there is no way to model the program state after the error, e.g.: after a division by zero. In this case the stack address escaped but it wasn't dereferenced (yet), so I think it might be safe to continue the analysis on this path. What do you think?

156 ↗(On Diff #120908)

The variable should start with an uppercase letter.

NoQ added inline comments.Oct 31 2017, 8:12 AM
lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
128 ↗(On Diff #120908)

dispatch_once isn't really asynchronous; i think it's out of place here.

Also we model it in the body farm, so i guess any errors inside its block would be caught anyway.

136–139 ↗(On Diff #120908)

If you like, feel free to add support for for (const auto &Item: Block->referenced_vars()) loops into BlockDataRegion :)

144 ↗(On Diff #120908)

getMemorySpace() is never null. Every region resides in a memory space.

153 ↗(On Diff #120908)

Not sure - the rest of the checker's code seems to be making use of the returned SourceRange by adding to the report, do we need that here?

test/Analysis/stack-async-leak.m
1 ↗(On Diff #120908)

We don't add -analyzer-store=region nowadays; i don't think it's useful (though it doesn't hurt).

60 ↗(On Diff #120908)

You may enjoy adding an extra note piece (as in D24278) to the offending statement within the block. It might be helpful if it's not obvious which block is being executed or which pointer points to the leaking variable.

Also it might be a good idea to think about a visitor that would track the block pointer to highlight where it was initialized with a concrete block code.

Though probably it's not super useful because we're most likely staying within a single function with this checker.

By the way, do we warn (or want to warn) on returning a block from a function, when this block captures a local variable?

NoQ edited edge metadata.Oct 31 2017, 8:13 AM

What i was trying to say is - Hey this check looks useful!~ Great idea.

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
144 ↗(On Diff #120908)

i was looking at StackAddrEscapeChecker::checkPreStmt - should it be fixed there as well, or that one is different ?

145 ↗(On Diff #120908)

I think you are right, would be better to make it non-fatal

Address the comments, add more tests

alexander-shaposhnikov marked 5 inline comments as done.Oct 31 2017, 2:13 PM
test/Analysis/stack-async-leak.m
60 ↗(On Diff #120908)

regarding tracking - yeah, i agree, that would be good, although i would prefer to make progress incrementally and add this part later.

By the way, do we warn (or want to warn) on returning a block from a function, when this >block captures a local variable?

good question - yeah, now we do

dcoughlin edited edge metadata.

I think this is a great idea, and I expect it to find some nasty bugs!

However, I'm worried about false positives in the following not-too-uncommon idiom:

dispatch_semaphore_t semaphore = dispatch_semaphore_create(0);
dispatch_async(some_queue, ^{

  // Do some work

  dispatch_semaphore_signal(semaphore);
});

// Do some other work concurrently with the asynchronous work

// Wait for the asynchronous work to finish
dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);

What do you think about suppressing the diagnostic when the block captures a variable with type 'dispatch_semaphore_t'? This isn't a perfect suppression, but it will catch most of the cases that I have seen.

Also, does this checker diagnose when a block captures a C++ reference? If so, it would be great to add a test for that!

Some additional comments inline.

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
124 ↗(On Diff #121100)

Could we break with tradition and add a oxygen comment describing what this method does?

133 ↗(On Diff #121100)

Can you factor this logic out with the logic it duplicates in checkPreStmt()?

136 ↗(On Diff #121100)

Stylistically we use isa<> for cases like these where the result of dyn_cast<> is not used.

146 ↗(On Diff #121100)

Instead of "might leak ... " I would suggest " is captured by an asynchronously-executed block". In the context of the analyzer we use "leak" to mean "resource that is not freed" rather than in the sense of exposing information.

You'll need to make a different variant of the diagnostic text for when the block is returned rather than asynchronously executed.

185 ↗(On Diff #121100)

Will this have a false positive if I create a block in a caller, pass it to a callee, and then the callee returns it? You may want to factor out and reuse the logic from below that compares the StackFrameContext for the current frame to the frame of the captured addresses in checkBlockCaptures().

test/Analysis/stack-async-leak.m
1 ↗(On Diff #121100)

Can you add an additional run line testing the expected behavior when ARC is not enabled? You can pass a -DARC_ENABLED=1 or similar to distinguish between the two (for example, to test to make sure that a diagnostic is emitted under MRR but not ARC).

alexander-shaposhnikov planned changes to this revision.Oct 31 2017, 9:58 PM

Refactor the checker, add more tests.

test/Analysis/stack-async-leak.m
1 ↗(On Diff #121100)

i've tried this, it somehow gets more complicated to read / change the tests - decided to create a separate file.

Add more tests for blocks being passed around

test/Analysis/stack-capture-leak-arc.mm
68 ↗(On Diff #121230)

c++ reference

Thanks!

Another round of comments inline. With those addressed it looks good to me -- but you should wait on Artem's go-ahead before committing.

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
121 ↗(On Diff #121230)

I think here you will need to check whether Q is a TypedefType and compare the IdentifierInfo for its TypedefNameDecl to 'dispatch_semaphore_t' instead.

The reason is that NSObject<OS_dispatch_semaphore> is an implementation detail and in fact dispatch_semaphore_t is typedef'd to another type depending on whether the file is Objective-C(++) or not.

There is an example of the idiom we use to do IdentifierInfo comparisons in ObjCDeallocChecker. We cache the identifier in the checker class. This lets us get constant-time comparisons on identifiers (since they are interned strings).

Also: It would be good to add a comment explaining why we care whether a semaphore was captured.

316 ↗(On Diff #121230)

We generally prefer to avoid combining style cleanups like these with functional changes except in the code that is required to be updated for the functional change. This makes it easier for future maintainer to use the commit history to understand what a commit did and why.

test/Analysis/stack-async-leak.m
1 ↗(On Diff #121100)

That's fine!

test/Analysis/stack-capture-leak-arc.mm
144 ↗(On Diff #121230)

Can you add a // no-warning comment to this line? We use that to indicate that we explicitly don't expect a warning there.

  1. Add no-warning comments
  2. Switch to using a cached IdentifierInfo (similar to how it's done in CheckObjCDealloc.cpp)
  3. Rerun the all the tests - they are fine.

@dcoughlin - many thanks for the code review,
yeah, i will wait for the feedback from NoQ@ as well.

NoQ added a comment.Nov 10 2017, 12:00 AM

Looks great, thanks! Minor inline stuff.

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
116 ↗(On Diff #122194)

Because stack address escapes are not fatal errors, it is not necessarily true that any other stack frame context here would be an ancestor of the current context. It may be an escaped stack variable from a function that has exited long time ago. You might want to check the getParent() chain.

Hmm, or rather rename the function to "isNotInCurrentFrame" because that's how you seem to be using it.

173 ↗(On Diff #122194)

Typo: capatured -> captured.

177 ↗(On Diff #122194)

Hmm, why would such region even be in the list?

184 ↗(On Diff #122194)

I think bugtype should describe the bug, so that the user knew what sort of bug it is by looking at the list. Maybe add "...is captured" here?

Address the comments

alexander-shaposhnikov marked 3 inline comments as done.Nov 10 2017, 12:58 PM

I found some nits, but overall I think this is getting close.

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
76 ↗(On Diff #122507)

In case you already change these lines you could replace the type name on the left side with auto to avoid repeating types.

123 ↗(On Diff #122507)

You can also use auto here to avoid mentioning the type twice.

192 ↗(On Diff #122507)

How long usually these error messages are? Maybe 512 is a bit large buffer for this? Note that in case the error message is longer this is still not a hard error, it just will hit the slow path (allocating the buffer on the heap).

214 ↗(On Diff #122507)

Maybe you wanted to mention "captured" here as well?

215 ↗(On Diff #122507)

Maybe the error reporting part could be abstracted out to avoid code duplication with the function above?

lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
192 ↗(On Diff #122507)

Note that in case the error message is longer this is still not a hard error, it just will hit the slow path

thanks, i'm aware of that.
512 was used in this file before my invasion,
the size obviously depends on what genName would generate (and inherently depends on the code style of the codebase)

adjust the messages, more uses of auto

NoQ accepted this revision.Nov 20 2017, 7:34 AM

Thanks, looks great, please commit!~

This revision is now accepted and ready to land.Nov 20 2017, 7:34 AM
This revision was automatically updated to reflect the committed changes.