This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Prevent crashing in NonNullParamChecker
ClosedPublic

Authored by george.karpenkov on Feb 28 2018, 8:00 PM.

Details

Summary

https://bugs.llvm.org/show_bug.cgi?id=36381
rdar://37543426

Turns out, the type passed for the lambda capture was incorrect.
One more argument to abandon the getSVal overload which does not require the type information.

Diff Detail

Repository
rL LLVM

Event Timeline

Hi George. The commit looks fine, but could you please add a test?

Updated in favor of setting the type inside RegionStore

NoQ accepted this revision.Mar 1 2018, 4:54 PM

Here's the way i understand what's going on. @xazax.hun, you'd probably like this one because it's about lambdas. Though in fact, of course, it's about RegionStore - lambda support doesn't seem to be at fault.

  1. f(i) on line 6 implicitly copies i in order to pass it into f().
  2. That implicit copy-constructor of class C accepts i as const C &.
  3. i is captured into the lambda by reference.
  4. It means that the lambda object contains a reference-type field that references the actual variable i.
  5. At least, it should have been containing that. We didn't actually implement that yet.
  6. This field has no name, but for the purposes of this discussion let's call it _i.
  7. We do try, however, to retrieve the value for _i while modeling DeclRefExpr i on line 6 - that's already implemented.
  8. The lambda is copied twice before we invoke it - first time to variable lambda, second time to *blah.
  9. performTrivialCopy makes a direct binding when copying the object, even though it doesn't explicitly intend to.
  10. I've no idea why the heap symregion matters, but having at least one copy seems important.
  11. Anyway, on line 6 _i is a FieldRegion within that HeapSymRegion.
  12. When we try to retrieve the value for field _i, we get into getBindingForFieldOrElementCommon().
  13. We immediately see that the whole HeapSymRegion has a direct lazy binding.
  14. We return this lazy binding as SVal for field _i, and, therefore, for DeclRefExpr i.
  15. This is our problem. i is an lvalue expression, so its defined value must not be a NonLoc.
  16. But instead we have a lazyCompoundVal for the whole HeapSymRegion.
  17. Once we specify our reference type explicitly, we also have to go through CastRetrievedVal() before returning from getBinding().
  18. CastRetrievedVal() fails to cast our lazyCompoundVal into a reference, so it returns UnknownVal.
  19. UnknownVal is not a NonLoc, which is acceptable.

The shocking part of this investigation is that for now even if region R is a typed-value region (FieldRegion in our case), specifying the optional type argument T in getSVal(R, T) is mandatory, otherwise the behavior may be completely incorrect. The current version of the patch tries to fix that. The underlying problem, however, remains: we should not be blindly returning a random binding on step 14, even if it's direct. And it is an open question for me whether performTrivialCopy should make a direct binding. Finally, i don't understand why everything else works and why this problem requires a super confusing reproducer with lambdas and heap allocations.

This revision is now accepted and ready to land.Mar 1 2018, 4:54 PM
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.