This patch handles all the comparision methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which makes it difficult to model the result.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The only method that I think can be realistically modelled is == (and thus !=). If both the operands refer to the same unique_ptr, we know == returns true. If they are not the same, the only way == can return true if the two smart pointers were initialized from the same raw pointer. This is of course a fatal bug in itself. So perhaps we can ignore this case and only consider the first case.
The ordering operators I guess can't be handled because there is no way to statically tell in general the address of some value. We have the following deductions, nevertheless, mathematically:
Let ptr1 and ptr2 be two std::unique_ptr objects.
If (ptr1 == ptr2) is true:
- ptr1 < ptr2 is false
- ptr1 > ptr2 is false
- ptr1 <= ptr2 is true
- ptr1 >= ptr2 is true
If (ptr1 == ptr2) is false, we can't say anything really.
Well, I think it depends. If one of the pointers is null, for some platforms, we can. E.g. null < non-null is probably true on most architectures, and similarly non-null < null is false. Also null <= anyptr is probably true on the platforms we care about.
If they are not the same, the only way == can return true if the two smart pointers were initialized from the same raw pointer. This is of course a fatal bug in itself.
Is it? E.g. two default constructed unique_ptrs both should have null as the underlying pointer, they should be considered equal, are not the same object, and this is not a fatal bug.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
343–345 | Another idea here. State = State->assume(llvm::dyn_cast<DefinedOrUnknownSVal>(RetVal), VALUE); addTransition = true; and having boolean addTransition, you can have Optional<bool> CompResult and do: CompResult = VALUE; And do the actual assumption at the bottom section when CompResult.hasValue() | |
367–386 | These are also symmetrical. | |
416–458 | These are symmetrical, is there a way to implement it without this boilerplate? |
Why not simply delegate this job to assume(evalBinOp(...)) over raw pointer values, which already has all this logic written down nicely?
Sorry about that! :( If we learned anything new in the process it was not wasted effort though.
I know what I learnt - if I am writing too much code, there is probably a function to that. :)
We have a failing test here (test at line 473).
Which makes me wonder if the handleComparision function is at all called. This is something I need to check.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
312 | So it looks like operator<< is the only operator we are not supporting. I think it might be good to include some test code to ensure that its use does not suppress warnings. (Also OK, if you decide to deal with this in a follow-up PR.) | |
345 | I think in cases where we know that the result is true or false, the RetVal should probably be a constant instead of a conjured symbol with an assumption. | |
352 | Instead of marking this unreachable, I'd suggest to just return a conjured symbol for now. Usually, we should not use asserts to test user input. | |
365 | I am not sure if we actually need all this special casing. You could create an SVal that represents a nullpointer constant and use evalBinOp with that SVal when there is no symbol available. | |
405 | cannot reach here is redundant information. That is already encoded in llvm_unreachable. I suggest including a message that tells the reader why is it unreachable. In this case it could be "unexpected overloaded operator kind". | |
clang/test/Analysis/smart-ptr.cpp | ||
466 | Putting tests like this on the same path can be risky. Tests might split the execution path (maybe not now, but in the future). I think it might be more future proof to have a large switch statement that switches on an unknown value and put the tests in separate cases. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
312 | Yes that's the next patch. (and one more for std::hash). | |
352 | Ah yes that's what is happening now. RetVal is initialized to a conjured value. If we can conclude no further, then that is what is returned - which is what happens here. In other cases, constraints are applied or it is replaced by the output from evalBinOp. | |
365 | Actually the naming is a bit misleading here, perhaps. FirstPtr is not the inner pointer itself but a pointer to the SVal. So the test FirstPtr && !SecondPtr checks that we know the SVal for the inner pointer of the first arg but we do not know that of the second arg. Using a null-pointer constant would not help. |
clang/test/Analysis/smart-ptr.cpp | ||
---|---|---|
466 | I didn't quite get you. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
336 | In case we conjure a new symbol, we want this stored in the TrackedRegionMap. So the analyzer can correctly record the constraints between the two symbols. | |
clang/test/Analysis/smart-ptr.cpp | ||
466 | You remember this in the other patch: member-constructor.cpp:15:5: warning: FALSE [debug.ExprInspection] clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ member-constructor.cpp:15:25: note: Assuming the condition is false clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~ member-constructor.cpp:15:5: note: FALSE clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ member-constructor.cpp:15:5: warning: TRUE [debug.ExprInspection] clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ member-constructor.cpp:15:25: note: Assuming the condition is true clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~ member-constructor.cpp:15:5: note: TRUE clang_analyzer_eval(*P->p == 0); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated. It looks like this does not happen for overloaded comparison operators at the moment. But we might want to do that in the future (@NoW what do you think). I was wondering, if we want to future proof these test cases for that behavior. But looking at the test cases again, you only have two, where the expected result is unknown, and they are at the very end. So feel free to ignore this and leave the code as is. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
343 | Btw, if we do not have a helper yet to translate between these enums in the analyer, we should create one. Could you look for some other places in the analyzer code base to double check? |
clang/test/Analysis/smart-ptr.cpp | ||
---|---|---|
466 | Then perhaps I should leave a comment to indicate this possibility. |
First try at implementing conversion function from OverloadedOperatorKind to BinaryOperatorKind
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
355–356 | Optimization: if one of the lookups succeeds, the other is unnecessary. A bit premature but I think we shouldn't lose too much elegance over this. | |
392–393 | This sounds like a super common functionality. Doesn't .get() do the same as well? I think it should be de-duplicated into a top-level method. | |
447–450 | Because these operators are pure boolean functions, a state split would be justified. It's pointless to call the operator unless both return values are feasible. I think you should do an assume() and transition into both states. It might also make sense to consult -analyzer-config eagerly-assume= before doing it but it sounds like this option will stays true forever and we might as well remove it. The spaceship operator is obviously an exceptional case. Invocation of the spaceship operator isn't a good indication that all three return values are feasible, might be only two. |
Sorry for not updating. Was down with fever.
This patch does *not* work now. operationKindFromOverloadedOperator is broken because the maps don't get populated. I am not entirely sure why this is happening.
Will try to fix tomorrow. @NoQ, @vsavchenko, @xazax.hun, @teemperor do you have a hunch as to why this may be happening?
I suspect that you might want to include OperationKinds.def instead of OperationKinds.h.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
447–450 | I think this comment has got displaced from its original location from subsequent updates. Could you please clarify? |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
447–450 | You can always go back in time by clicking the |<< button in the top-left corner of the comment ;) I'm trying to say that you should introduce a state split while evaluating the operators, not wait for the control flow operators to do that for you, exactly like the static analyzer currently does for the raw comparison operators. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
409 | Do we want to create a new SVal in this case? Why returning S is insufficient? | |
417 | While this is smart (overwriting the state here), I think this is hard to follow. I'd prefer this lambda returning a pair. | |
436 | assume might not return a state (when the analyzer is smart enough to figure out one path is infeasible). While your code would probably work as is, as far as I remember we usually try to not call addTransition with null states. | |
clang/test/Analysis/smart-ptr.cpp | ||
484 | I do not see test cases where the path is actually split. Would you add some? |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
432 | Now that we've made a state split, it makes sense to bind a concrete true value or a concrete false value (depending on the state) instead of a symbolic value. This produces simpler symbolic equations for our constraint solver to handle. I.e., something like std::tie(TrueState, FalseState) = State->assume(...); if (TrueState) C.addTransition(TrueState->BindExpr(E, LCtx, SVB.makeTruthVal(true)); if (FalseState) C.addTransition(FalseState->BindExpr(E, LCtx, SVB.makeTruthVal(false)); | |
436 | Yeah I have a lot of anxiety about passing nulls there as well, have to look up the source code of these functions every time :/ |
clang/test/Analysis/smart-ptr.cpp | ||
---|---|---|
484 | Oops! I removed two tests emitting UNKNOWN and forgot to put them back in. |
I only have a few nits left.
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
36 | I think you're not using this anymore. | |
317 | One good place to put this may be CheckerHelpers.h. This is where we dump all the stuff that's probably useful in multiple checkers but not in other places. I also wonder if you plan to support unary operators. The interesting part about them is that they are sometimes ambiguous to binary operators in their string representation, eg. -. | |
436 | C.getSValBuilder() is called three times now, you might want to stash it in a reference. There's probably not much performance benefit but the code should be easier to read this way. | |
clang/test/Analysis/Inputs/system-header-simulator-cxx.h | ||
983 | I think it's a good idea to test these cross-type comparison operators as well. IIUC they're handled exactly the same as their same-type counterparts but it may uncover occasional assertion failures. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
317 | I guess passing a parameter to chose operator arity should work. |
clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp | ||
---|---|---|
351 | Nit: couldn't you just return retrieveOrConjureInnerPtrVal(Reg, E, Type, C); instead of all the ceremony with std::tie? | |
366 | I think we might end up losing some information here. Imagine both makeSValFor conjuring a new symbol. Only one of the symbols will be stored in the state since capturing happend once, when you created the lambda. Maybe it is better to ask for a state in makeSValFor as a parameter instead of doing a lambda capture. Moreover, retrieveOrConjureInnerPtrVal will not even look at the state captured by makeSValFor. It will ask the CheckerContext to get the state, but that state is the state from the beginning of the function that does not have any of the modifications you made to it. |
I think you're not using this anymore.