This patch adds partial support for tracking (i.e. modeling) the contents of an
optional value. Specifically, it supports tracking the value after it is
accessed. We leave tracking constructed/assigned contents to a future patch.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Overall looks good to me. I am curious what will the strategy be to properly support construction. Do you plan to introduce a customization point to Env.createValue to give checks/models a way to set properties up? Or do you have something else in mind?
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
229 | I recall some problems with convergence due to creating fresh locations lazily in multiple branches. I wonder if doubling down before finding a proper solution could make it harder to fix this in the near future. If this is just a temporary solution until construction is better supported, it might be fine. | |
301 | I wonder how well this strategy works in practice. If we have many unwrap calls to the same optional, we might end up emitting lots of diagnostics. An alternative approach is to assume that the optional has a value after the first unwrap. This would ensure that we only report a particular problem once and can reduce the noise. |
If we want to go the eager route, than I think that's the "right" solution. Anything else is brittle, since it is easy to miss a construction point. However, I'm more inclined to move the whole framework to a lazy initialization process, if we can devise one.
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
229 | Excellent question. Yes, you're right about the convergence issue this falls into the same category. That said, we intentionally reuse the same location in all branches, which mitigates that issue in this case -- we are at least assured that the two branches will have the contents value at the same location. In fact, the FIXME I added is wrong (and I've adjusted it accordingly). The property will be shared by all blocks that share the same value representing the Optional (which for the most part is done eagerly). To your larger point, about the advisability of this approach, even for the short term:
| |
301 | I see your point, and agree that there does seem room for significantly reducing warnings, even if all are correct. But, in practice, we haven't gotten complaints (which is a little surprising, actually). I wonder if it would be good to pair the report with the surrounding compound statement, and limit to one per compound statement. I fear that shutting it off entirely after the first issue might be _too_ quiet. Either way, this is definitely something that we'll be actively tweaking. |
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
180–181 | Why is this necessary? Do we really need to create a new value on each getHasValue call? Can we centralize initialization to avoid this? I'm worried that "has_value" and "value" initialization isn't coordinated. | |
229 | Perhaps call this maybeInitializeOptionalValueMember? | |
229 | Let's pass the type instead of Expr, if that's the only thing we need. | |
232–234 | It's unclear to me why we "need" a StorageLocation. Representing the value as a ReferenceValue seems good to me, but there are other options. What issues are we running into if we remove the indirection? | |
241–246 | So we don't lose the ReferenceValue, but we do lose its pointee? Is that the behavior we want when merging? | |
291 | Shouldn't we use getHasValue here? | |
clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp | ||
2188–2260 | Do we really need tests for all members that check/access/reset the content? These are already covered in tests above. | |
2280 | ||
2287 | Do we really need this one or is it already covered by one of the tests in this file? |
I'm having trouble with arcanist -- arc diff is crashing, but wanted to respond in the meantime to your questions. I hope I'll be able to actually upload the new diff soon...
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp | ||
---|---|---|
180–181 |
We don't create a new one on each call, only when none is already populated. As for centralizing -- we could, but I'd like to move to lazy initialization anyhow, so it seemed less than ideal to put effort into centralizing at this point. | |
232–234 |
We can't address it. Compare with a normal field, where the corresponding AggregateStorageLocation has a child that maps to a storage location for the field. We don't support properties on AggregateStorageLocation, so the location has to exist somewhere. What other options do you see? | |
241–246 |
Correct. No, that's not the behavior we want, hence the FIXME that follows. :) But, it at least accounts for the other two situations, so its a win overall. | |
291 |
Not quite:
| |
clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp | ||
2188–2260 | Great point. I rethought the tests in general -- I'd mostly been copying them over, without carefully rethinking them. I've reduced the number of tests and added comments explaining their purposes. Essentially, we need to test that the synthetic value can be used like a normal field. | |
2280 | No longer relevant. | |
2287 | Covered. |
address reviewer comments and fix bug in how the nested value as _re_created -- now uses the loc's type rather than the expression's type (which doesn't always match exactly).
Why is this necessary? Do we really need to create a new value on each getHasValue call? Can we centralize initialization to avoid this? I'm worried that "has_value" and "value" initialization isn't coordinated.