This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Relax validity assumptions in `UncheckedOptionalAccessModel`.
ClosedPublic

Authored by ymandel on Jan 27 2023, 5:22 AM.

Details

Summary

Currently, the interpretation of swap calls in the optional model assumes the
optional arguments are modeled (and therefore have valid storage locations and
values). This assumption is incorrect, for example, in the case of unmodeled
optional fields (which can be missing either value or location). This patch
relaxes these assumptions, to return rather than assert when either argument is
not modeled.

Diff Detail

Event Timeline

ymandel created this revision.Jan 27 2023, 5:22 AM
Herald added a project: Restricted Project. · View Herald Transcript
ymandel requested review of this revision.Jan 27 2023, 5:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 27 2023, 5:22 AM
sgatev added inline comments.Jan 27 2023, 7:36 AM
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
524–540

What do you think about passing const StorageLocation* instead of const Expr&? This way we don't need to pass E1Skip.

534

Any reason to not set a fresh value for Loc1 in this case (similarly a fresh value for Loc2 below)?

ymandel marked 2 inline comments as done.Jan 27 2023, 7:49 AM
ymandel added inline comments.
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
524–540

Sure, but means we'll be pushing the calls to getStorageLocation to callers. I'm fine with that, since it means a less janky API, but just want to call that out.

534

The new value for Loc1/Loc2 won't be connected to anything, so it won't bring any benefit to the modeling -- it will only make the code simpler.

ymandel marked 2 inline comments as done.Jan 31 2023, 8:54 AM

Gentle ping.

xazax.hun accepted this revision.Jan 31 2023, 3:07 PM

This change looks good to me. I wonder, however, whether the behavior should be parameterized in the future. E.g., whether the user of the analysis should be able to make a decision whether the analysis should be pessimistic or optimistic about unmodeled values.

This revision is now accepted and ready to land.Jan 31 2023, 3:07 PM

This change looks good to me. I wonder, however, whether the behavior should be parameterized in the future. E.g., whether the user of the analysis should be able to make a decision whether the analysis should be pessimistic or optimistic about unmodeled values.

Interesting idea. I think this goes along with other places where we are unsound. Here, we err on the side of soundness. but, in general, we should have a configuration mechanism for this. FWIW, the only reason we have uninitialized values at this point is recursive types. We also limit the depth of structs, but that should be removed given my recent patch to only model relevant fields. I have an idea for lazy initialization of values that I think could solve the recursion issue. Together, we could remove this concept of unmodeled values altogether from the framework.

This revision was landed with ongoing or failed builds.Feb 1 2023, 7:57 AM
This revision was automatically updated to reflect the committed changes.

This change looks good to me. I wonder, however, whether the behavior should be parameterized in the future. E.g., whether the user of the analysis should be able to make a decision whether the analysis should be pessimistic or optimistic about unmodeled values.

Interesting idea. I think this goes along with other places where we are unsound. Here, we err on the side of soundness. but, in general, we should have a configuration mechanism for this. FWIW, the only reason we have uninitialized values at this point is recursive types. We also limit the depth of structs, but that should be removed given my recent patch to only model relevant fields. I have an idea for lazy initialization of values that I think could solve the recursion issue. Together, we could remove this concept of unmodeled values altogether from the framework.

Oh, sounds great! I do think lazy initialization will be really valuable to reduce the number of unmodeled values, but not entirely sure if we can completely eliminate them. In case we end up creating new locations (different from the earlier ones) in every iteration of the loop it might be harder to reach a fixed point.

This change looks good to me. I wonder, however, whether the behavior should be parameterized in the future. E.g., whether the user of the analysis should be able to make a decision whether the analysis should be pessimistic or optimistic about unmodeled values.

Interesting idea. I think this goes along with other places where we are unsound. Here, we err on the side of soundness. but, in general, we should have a configuration mechanism for this. FWIW, the only reason we have uninitialized values at this point is recursive types. We also limit the depth of structs, but that should be removed given my recent patch to only model relevant fields. I have an idea for lazy initialization of values that I think could solve the recursion issue. Together, we could remove this concept of unmodeled values altogether from the framework.

Oh, sounds great! I do think lazy initialization will be really valuable to reduce the number of unmodeled values, but not entirely sure if we can completely eliminate them. In case we end up creating new locations (different from the earlier ones) in every iteration of the loop it might be harder to reach a fixed point.

True, and in some sense TopBoolValue is already that. If we extended Top to other value domains (like struct) the code would end up looking very similar, just spelled "top" instead of "nullptr". I'd prefer Top over nullptr, but it admittedly wouldn't change things in a fundamental way.