Page MenuHomePhabricator

Modifying PthreadLockChecker.cpp to reduce false positives.
ClosedPublic

Authored by malhar1995 on Apr 24 2017, 12:13 PM.

Details

Summary

I am currently working on to avoid false positives which currently occur as the return values of mutex functions like pthread_mutex_destroy() are not taken into consideration.

The precise description of the bug can be found here: https://bugs.llvm.org/show_bug.cgi?id=32455

Dr. Devin and Dr. Artem have been guiding me to fix PthreadLockChecker to avoid such false positives. The patch I'm attaching is not 100% correct and hence I need your advice to proceed further.

Thank you.

Diff Detail

Repository
rL LLVM

Event Timeline

malhar1995 created this revision.Apr 24 2017, 12:13 PM
NoQ edited reviewers, added: NoQ; removed: dergachev.a.May 3 2017, 4:28 AM
NoQ added a subscriber: NoQ.

Uhm, i need to do something about this duplicate account. Sorry, I have completely forgotten that the review is already up...

NoQ edited edge metadata.May 3 2017, 5:17 AM

Thanks for uploading this to phabricator and sorry again that i was lost for a while.

As we already discussed in the mailing lists, i agree with your point that the locked-and-possibly-destroyed state should be removed, and we also had a thought of making it explicit that the patch only applies to posix thread semantics, not to XNU semantics - in other words, there should appear different branches of code depending on the lock's semantics.

Could you also upload the patch file with the "context" (eg. git diff -U999999 would include +/- 999999 unchanged lines around the changes in the patch file, and phabricator would use them fancily to make reviewing easier; otherwise i don't immediately see which callback you are changing in a particular spot).

Once the patch would reach completion, it'd need to be cleaned up for formatting/whitespace; we usually run clang-format over our changes with default settings and it does everything for us.

lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp
21 ↗(On Diff #96439)

You don't need to include <iostream even for your own debugging, because we have this other facility:

llvm::errs() << "prpr " << RetVal << '\n';

(yeah, you can put SVal and some other objects into llvm::errs() directly (which would be equivalent to .dump()ing them), which is also handy).

145–151 ↗(On Diff #96439)

This code gets duplicated multiple times - including the checkDeadSymbols callback as well; can we refactor it into a function?

154 ↗(On Diff #96439)

I'd like you to consider various corner cases. Note that because we have the REGISTER_MAP_WITH_PROGRAMSTATE privately in this file, only code in this file can affect the contents of that program state trait. So we have complete knowledge of what can and cannot be in the program state.

For example, can it be that there's a symbol in the DestroyRetVal map, but lstate is not present for the same mutex region in the LockMap? Or does the code ensure that the former implies the latter? If we are sure that some invariants hold, then we should remove the respective if () and ideally replace it with an assert().

If you find invariants that always hold, it would be great to write these down in the comments inside the code.

Thank you for the patch! Could you please re-submit the patch with context? Instructions on how to do that can be found here:
http://llvm.org/docs/Phabricator.html

Added context.
Also, I removed the inclusion of iostream and also added the repetitive code to the function setAppropriateLockState.
Currently working on finding various corner cases and invariants.

malhar1995 updated this revision to Diff 99388.May 17 2017, 8:38 PM
malhar1995 marked an inline comment as done.

Cleaned up the previous patch.
Added checking of LockState before initializing a mutex as well.
Added separate branches of execution for PthreadSemantics and XNUSemantics.
Added assert in case of checkDeadSymbols as existence in DestroyRetVal ensures existence in LockMap.

NoQ added a comment.May 20 2017, 5:30 AM

Thanks! Your code looks very clear now, and it seems correct to me.

One last thing we definitely should do here would be add regression tests for the new functionality. I guess you already have your tests, otherwise you wouldn't know if your code works, so you'd just need to append them to the patch, somewhere at the bottom of test/Analysis/pthreadlock.c, and make sure that make -j4 check-clang-analysis passes. Ideally, we should cover as many branches as possible.

A few ideas of what to test (you might have thought about most of them already, and i expect them to actually work by just looking at what your code accomplishes):

  • What we can/cannot do with the mutex in the failed-to-be-destroyed state, depending on the state of the mutex before destruction was attempted.
  • What we can/cannot do with the mutex in each of the "Schrodinger" states - in particular, do we display the double-destroy warning in such cases?
  • How return-symbol death races against resolving success of the destroy operation: what if the programmer first tries to destroy mutex, then uses the mutex, then checks the return value?
  • Are you sure we cannot assert(lstate) on line 137? - a test could be added that would cause such assertion to fail if someone tries to impose it.

Apart from that, i guess it'd be good to use more informative variable names in a few places (see inline comments), and fix the formatting, i.e. spaces and line breaks (should be easy with clang-format). Also you shouldn't add the .DS_Store files :) And then we'd accept and commit this patch.

lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp
28 ↗(On Diff #99388)

I still think these names, no matter if a good metaphor or not and no matter how much i enjoyed them, should be toned down :) Suggesting UntouchedAndPossiblyDestroyed and UnlockedAndPossiblyDestroyed.

77 ↗(On Diff #99388)

I suggest renaming to something like "resolvePossiblyDestroyedMutex()".

Also, i'm for passing the symbol by value (with * dereference at most call sites) because it's less surprising/confusing to the reader.

I also suggest a comment explaining what the function does. Eg., "When a lock is destroyed, in some semantics we are not sure if the destroy call has succeeded or failed, and the lock enters one of the 'possibly destroyed' state. There is a short time frame for the programmer to check the return value to see if the lock was successfully destroyed. Before we model the next operation over that lock, we call this function to see if the return value was checked by now and set the lock state - either to destroyed state or back to its previous state."

81–85 ↗(On Diff #99388)

Because there's only one comment per three traits, it'd be great to clean this up a bit together with commenting up your new trait:

// A stack of locks for tracking lock-unlock order.
REGISTER_LIST_WITH_PROGRAMSTATE(LockSet, const MemRegion *)

// An entry for tracking lock states.
REGISTER_MAP_WITH_PROGRAMSTATE(LockMap, const MemRegion *, LockState)

// Return values for unresolved destroy calls.
REGISTER_MAP_WITH_PROGRAMSTATE(DestroyRetVal, const MemRegion *, SymbolRef)
143–146 ↗(On Diff #99388)

I think we can be certain that the lock is in one of these states, and assert that.

149–151 ↗(On Diff #99388)

Assert the lock state here as well?

302–308 ↗(On Diff #99388)

A bit shorter and more stylish:

SymbolRef RetSym = C.getSVal(CE).getAsSymbol();
if (!RetSym)
  return;

Also, we may want to see what happens when the value is not a symbol. This would be surprising, but it may happen if the body of pthread_mutex_destroy() is suddenly available in our translation unit and we modeled the call, or another checker has assisted with modeling it. In this case the return value may be a concrete integer (say 0 or 1) or an UnknownVal or an UndefinedVal. It may also be something completely weird such as a pointer or a structure if the user defines his own function that is called "pthread_mutex_destroy()" but does something completely different.

I suggest to stop tracking the lock region in case RetSym is null (remove LockR from the LockMap), because in the really weird case when this code actually gets triggered, we'd at least not say that the mutex is still locked later on the path.

Additionally, by adding the transition before that return, we ensure that setAppropriateLockState above has taken effect.

399 ↗(On Diff #99388)

Could you add here a

// TODO: Clean LockMap when a mutex region dies.

That's not something you should instantly do, but it's definitely something that needs to be done for this checker some day.

malhar1995 marked 2 inline comments as done.May 20 2017, 5:54 AM
In D32449#760141, @NoQ wrote:

Thanks! Your code looks very clear now, and it seems correct to me.

One last thing we definitely should do here would be add regression tests for the new functionality. I guess you already have your tests, otherwise you wouldn't know if your code works, so you'd just need to append them to the patch, somewhere at the bottom of test/Analysis/pthreadlock.c, and make sure that make -j4 check-clang-analysis passes. Ideally, we should cover as many branches as possible.

A few ideas of what to test (you might have thought about most of them already, and i expect them to actually work by just looking at what your code accomplishes):

  • What we can/cannot do with the mutex in the failed-to-be-destroyed state, depending on the state of the mutex before destruction was attempted.
  • What we can/cannot do with the mutex in each of the "Schrodinger" states - in particular, do we display the double-destroy warning in such cases?
  • How return-symbol death races against resolving success of the destroy operation: what if the programmer first tries to destroy mutex, then uses the mutex, then checks the return value?
  • Are you sure we cannot assert(lstate) on line 137? - a test could be added that would cause such assertion to fail if someone tries to impose it.

    Apart from that, i guess it'd be good to use more informative variable names in a few places (see inline comments), and fix the formatting, i.e. spaces and line breaks (should be easy with clang-format). Also you shouldn't add the .DS_Store files :) And then we'd accept and commit this patch.

Dear Dr. Artem,

Thank you so much for such a detailed review. I'll work on addressing these comments ASAP and reach out to you in case I have any queries.

Regards,
Malhar Thakkar

malhar1995 added inline comments.May 20 2017, 9:41 AM
lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp
143–146 ↗(On Diff #99388)

We can be certain that the lock state will be either of the two only if I add the following statement before returning from this function.

state = state->remove<DestroyRetVal>(lockR);

If I don't add the above statement, a return value symbol for the region specified by lockR will still be in DestroyRetVal and it may have an actual lock state (locked, unlocked or destroyed).

malhar1995 updated this revision to Diff 99750.May 22 2017, 6:10 AM

Addressed previous comments (removed Schrodinger from lock state names, changed method name setAppropriateLockState to resolvePossiblyDestroyedMutex, added an assert in resolvePossiblyDestroyedMutex, formatted the code using clang-format and added some test-cases to test/Analysis/pthreadlock.c)

malhar1995 marked 6 inline comments as done.May 22 2017, 8:13 AM
NoQ added a comment.May 22 2017, 12:14 PM

Thanks, this is great! Two more things:

  • You have touched other code, unrelated to your patch, with clang-format; we're usually trying to avoid that, because it creates merge conflicts out of nowhere, and because some of that code actually seems formatted by hand intentionally. It's better to revert these changes; you can use the git clang-format thing to format only actual changes.
  • Updating .gitignore sounds like the right thing to do (llvm's .gitignore already has this), but i guess we'd better make a separate commit for that.
lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp
143–146 ↗(On Diff #99388)

Yep, that's a great thing to do. I didn't notice this.

Generally, it's great to keep the program state free from stuff that would no longer be necessary.

In D32449#761303, @NoQ wrote:

Thanks, this is great! Two more things:

  • You have touched other code, unrelated to your patch, with clang-format; we're usually trying to avoid that, because it creates merge conflicts out of nowhere, and because some of that code actually seems formatted by hand intentionally. It's better to revert these changes; you can use the git clang-format thing to format only actual changes.

I did not apply clang-format to any file except for PthreadLockChecker.cpp. Do you think the merge conflict is due to me not applying clang-format to test/Analysis/pthreadlock.c? The only files I changed were .gitignore, PthreadLockChecker.cpp and test/Analysis/pthreadlock.c.
Also, when you asked me to revert the changes, did you mean revert the changes made by clang-format? If yes, how do I do that?
I apologize for asking such silly questions. The thing is I'm new to all this and I don't really know how to proceed further.

  • Updating .gitignore sounds like the right thing to do (llvm's .gitignore already has this), but i guess we'd better make a separate commit for that.
NoQ added a comment.May 23 2017, 9:01 AM

No-no, all i was trying to say is that there's code in PthreadLockChecker.cpp that you haven't changed, but accidentally reformatted - and this is something we normally try to avoid. Like, for example, changing enum LockingSemantics {...} from vertical to horizontal - that wasn't your intention, it just accidentally happened because you auto-reformatted the whole file. I don't mind these changes, and i didn't mean they introduce any merge conflicts for now, though they tend to do so in the future for other people working on the same code, as we have a few downstream users, and magenta guys who are working on this checker as part of D26342), so most of the time it's better not to introduce unnecessary changes.

I'm not sure if you can easily revert the style-only changes via git clang-format, but you should be able to find them with the help of git reflog if you made a local commit before running clang-format (even if it was later discarded or amended), otherwise only manually i guess.

Applied clang-format only to the changed lines in the final code.

NoQ accepted this revision.May 29 2017, 5:25 AM

I'd commit your patch without the .gitignore change, as it deserves a separate commit and more attention; will have a look at it myself - llvm's and clang's .gitignores have diverged quite a bit.

Thanks for taking this up!~

This revision is now accepted and ready to land.May 29 2017, 5:25 AM
This revision was automatically updated to reflect the committed changes.