Add support for correlated branches to the std::optional dataflow model.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks! Looks good overall, just the one question about loops.
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
694–696 | What's the plan for loops? This will increase the size of the constraint on every iteration, preventing termination. |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
694–696 | I have nothing to add, just a brain dump. In this particular case, the analysis domain has infinite chains, the size of the Boolean formulas can increase indefinitely (at least syntactically). The canonical solution to deal with infinite chains is to do a widening operation such that the widened state over approximates both the state of the old and the new iteration. A very simple way of doing this would be to simply erase all knowledge we accumulated about the Boolean value. On the other hand, I wonder if it is often possible to do something smarter. There might be cases when the value is syntactically larger in the new iteration but actually it has the exact same truth table. The solver might be able to show this. On the other hand, if the loop generates new Boolean symbols in each iteration, we have little hope fore that. |
Actually, I think in most cases we want to be consistent on how to merge bool values. So I wonder whether instead of reimplementing the merge operation in this check we should just call a function that does the work. And the same function should be used within the engine to merge states after if statements and so on.
Agreed. I think we made a mistake with bools, to be honest, in supporting correlated conditions, given our lack of distinction between join and widen -- something I didn't appreciate fully at the time of the patch. I propose:
- In this patch, we go with a widening operation, but put the relevant logic in the core, so it can be reused for booleans in general.
- Follow up with a patch that uses the new logic for general booleans.
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
694–696 |
Agreed. In our internal implementation, we have something like this -- it consults the solver to see if the two are logically equivalent. That loses the (subtle) support for correlated conditions. However, until we have more precise support for widening (that is, separate from merge), I think this is the only way to go to ensure that loop analysis terminates. |
Depending on how much simplicity are we willing to sacrifice, doing a merge here might be OK. We only need widening to ensure the convergence of loops. Some frameworks are doing the regular merge operator for joins in the CFG and only use the widening operator along the back edges. This way they can be a bit more precise while preserving convergence.
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
694–696 | After discussing this, we decided to go for the internal implementation which is tested on a large codebase and commit to smarter merging that preserves flow condition information only after testing it extensively. I have updated the patch accordingly, please take a look. |
I love the tests (pretty cool examples of what can be handled). Thanks for the thoroughness!
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
174 | I believe you can relax this to Value because setProperty is no longer specific to StructValue. | |
289–291 | I think this is no longer necessary now that we don't rely on StructValue for the getProperty call. | |
700 | MergedVal is enough here, assuming the changes suggested above. |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
174 | I did that intentionally because I still think it makes sense to assert that an optional is modeled as a StructValue. I haven't thought about where and how it'd be best to assert that though so I'll happily remove the casts for now. |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
174 | Sounds good. I had the same thought. But, it occurs to me that at this point, are we using the StructValue at all? If not, then the only reason we expect it to be a StructValue is because we know the optional type is a record type. But, that's not an assumption of our model. So, maybe we should be agnostic to the underlying representation? |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
288 | I wonder whether NonEmpty is clearer than Engaged. I think Engaged can also be OK, but we probably want to have a comment somewhere to explain the terminology. | |
700 | Would it make sense to have the converese? I.e., when both optionals are empty, create an empty optional here. While currently this modeling does not distinguish between unchecked and empty optionals it might be handy in the future. Although I am also happy with this as is, not adding any code that does not have any utility as of today. |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
174 | Agreed. Let's not assume a StructValue representation unless it's necessary. | |
288 | Sounds good. I went with isEmptyOptional and isNonEmptyOptional. | |
700 | Good idea! I think this can be useful. I added the code and a test case based on dead code: if (b) { opt = std::nullopt; } else { opt = std::nullopt; } if (opt.has_value()) { // unreachable } |
I believe you can relax this to Value because setProperty is no longer specific to StructValue.