[analyzer][UninitializedObjectChecker] Fixed dereferencing
ClosedPublic

Authored by Szelethus on Aug 21 2018, 12:41 PM.

Details

Summary

This patch aims to fix derefencing, which has been debated for months now.

Instead of working with SVals, the function now relies on TypedValueRegion.

Since this has been discussed since the inception of the checker, I'd very much prefer finding a permanent solution for this before moving forward.

Diff Detail

Repository
rL LLVM
Szelethus created this revision.Aug 21 2018, 12:41 PM

@NoQ, is this how you imagined derefercing to work? I'm still a little uncertain that I implemented this in a way that addresses all your concerns.

NoQ added a comment.Aug 21 2018, 1:21 PM

Yup, this looks great and that's exactly how i imagined it.

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObject.h
261–265 ↗(On Diff #161796)

We have a fancy static Loc::isLocType().

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
126–127 ↗(On Diff #161796)

Good catch.

It might still be possible to initialize a reference with an already-undefined pointer if core checkers are turned off, but we don't support turning them off, so i guess it's fine.

177 ↗(On Diff #161796)

Just SVal. And you should be able to pass R directly into getSVal.

210 ↗(On Diff #161796)

We usually just do .getAsRegion().

And then later dyn_cast it.

212–216 ↗(On Diff #161796)

Ok, i have a new concern that i didn't think about before.

There's nothing special about symbolic regions. You can make a linked list entirely of concrete regions:

struct List {
  List *next;
  List(): next(this) {}
};

void foo() {
  List list;
}

In this case the finite-ness of the type system won't save us.

I guess we could also memoize base regions that we visit :/ This is guaranteed to terminate because all of our concrete regions are based either on AST nodes (eg. global variables) or on certain events that happened previously during analysis (eg. local variables or temporaries, as they depend on the stack frame which must have been previously entered). I don't immediately see a better solution.

223 ↗(On Diff #161796)

We might have eliminated symbolic regions but we may still have concrete function pointers (which are TypedRegions that aren't TypedValueRegions, it's pretty weird), and i guess we might even encounter an AllocaRegion (which is completely untyped). I guess we should not try to dereference them either.

240–244 ↗(On Diff #161796)

This code seems to be duplicated with the "0th iteration" before the loop. I guess you can put everything into the loop.

Szelethus updated this revision to Diff 162198.EditedAug 23 2018, 9:17 AM

Addressed the inline comments.

  • Added a llvm::SmallSet to further ensure the termination of the loop.
  • Changed isLocType to isDereferencableType, block pointers are regarded as primitive objects.
  • Added new tests.
  • No longer using getDynamicTypeInfo, dynamic type is acquired from TypedValueRegion::getLocationType.
Szelethus marked 5 inline comments as done.Aug 23 2018, 9:27 AM
Szelethus added inline comments.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObject.h
261–265 ↗(On Diff #161796)

Oh, good to know! However, it also returns true for nullptr_t, which also happens to be a BuiltinType. I'd like to keep isPrimitiveType and (the now renamed) isDereferencableType categories disjunctive. Primitive types require no further analysis other then checking whether they are initialized or not, which is true for nullptr_t objects.

259 ↗(On Diff #162198)

I'm not sure this is correct -- do block pointers belong here? Since their region is not TypedValueRegion, I though they better fit here.

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
240–244 ↗(On Diff #161796)

I moved some code into the loop, but I think that I really need a 0th iteration to make the code readable.

Szelethus added a comment.EditedAug 29 2018, 12:23 PM

I managed to cause a bunch of crashes with this patch, so consider it as a work in progress for now :/

Szelethus updated this revision to Diff 163185.Aug 29 2018, 1:57 PM

Fixed two crashes. Interesting how many much better this approach works, as I've never encountered FunctionTyped objects or only forward declared typed pointers after dereferencing.

Szelethus updated this revision to Diff 163499.Aug 31 2018, 4:24 AM

Reuploaded with -U99999. Oops.

Szelethus updated this revision to Diff 163992.EditedSep 5 2018, 2:44 AM

Fixed an assertation failure where a lambda field's this capture was a part of the fieldchain.

Szelethus added inline comments.Sep 5 2018, 2:53 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
126–127 ↗(On Diff #161796)

I removed it, because it did crash couple times on LLVM. Note that the assert checked whether the reference for undefined, not uninitialized :/.

It's no longer in the code, but this was it:

assert(!FR->getDecl()->getType()->isReferenceType() &&
       "References must be initialized!");
test/Analysis/cxx-uninitialized-object.cpp
879–902 ↗(On Diff #163992)

I'm 99% sure this is a FP, but it doesn't originate from the checker. Shouldn't *ptr be undef after the end of the code block as lambda's lifetime ends?

Nevertheless, it did cause a crash, so here's a quick fix for it.

NoQ added inline comments.Sep 6 2018, 6:18 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
126–127 ↗(On Diff #161796)

Hmm, indeed, it seems that we don't have a checker for garbage references, only for null references(?).

I guess it's a good idea for a checker.

test/Analysis/cxx-uninitialized-object.cpp
879–902 ↗(On Diff #163992)

I'm pretty sure that all sorts of contents of lambda aka *ptr are undefined once it goes out of scope. Moreover, ptr is now a dangling pointer, and reading from it would cause undefined behavior. I'm not sure if the analyzer actually models this though.

But on the other hand, even if it didn't go out of scope, i don't really see where field a was initialized here.

Soo what makes you think it's a false positive?

Szelethus added inline comments.Sep 7 2018, 3:56 AM
test/Analysis/cxx-uninitialized-object.cpp
879–902 ↗(On Diff #163992)

Soo what makes you think it's a false positive?

Poor choice of words I guess. Its not a false positive (as the entire region of that lambda is undefined), but rather a false negative, as the analyzer doesn't pick up that *ptr is a dangling pointer.

I ran the checker on LLVM/Clang and grpc, and saw no crashes at all! Quite confident about the current diff. However, I'm not 100% sure it doesn't break on Objective C++ codes. Can you recommend a project like that?

NoQ accepted this revision.Sep 13 2018, 6:56 PM

Looks good i guess. I'm glad this is sorted out^^

This revision is now accepted and ready to land.Sep 13 2018, 6:56 PM
This revision was automatically updated to reflect the committed changes.