This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Add widening API and implement it for built-in boolean model.
ClosedPublic

Authored by ymandel on Nov 14 2022, 7:19 AM.

Details

Summary
  • Adds API support for widening of lattice elements and environments,
  • Updates the algorithm to apply widening where appropriate,
  • Implements widening for boolean values. In the process, moves the unsoundness of comparison from the default implementation of Environment::ValueModel::compare to model-specific handling inside DataflowEnvironment::equivalentTo. This change is intended to clarify the source and location of unsoundess.

This patch is a replacement for, and was based substantially on, https://reviews.llvm.org/D131645.

Diff Detail

Event Timeline

ymandel created this revision.Nov 14 2022, 7:19 AM
Herald added a project: Restricted Project. · View Herald Transcript
ymandel requested review of this revision.Nov 14 2022, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 14 2022, 7:19 AM
xazax.hun added inline comments.Nov 14 2022, 2:06 PM
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
68

We should document what is the return value for. Also I see LatticeJoinEffect instead of bool in the code, but I might be confused.

106

I wonder if LatticeJoinEffect is the right name if we also use this for widening. (Also, are we in the process of removing these?)

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
140

First, I was confused, because I thought they could be equivalent as well. But I guess this code is only called for values that are not considered equivalent. Documenting this is really useful.

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
374–378

Shouldn't we have a simple copy ctor from PrefEnv that could populate all these fields?

Also, this makes me wonder if we actually want to move some of these out from environment, since these presumably would not change between basic blocks.

ymandel updated this revision to Diff 475468.Nov 15 2022, 7:12 AM

Rewrite widen-related comments.

ymandel updated this revision to Diff 475469.Nov 15 2022, 7:15 AM
ymandel marked an inline comment as done.

fix typo

li.zhe.hua added inline comments.Nov 15 2022, 7:19 AM
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
164

I think this is missing a word.

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
229
ymandel updated this revision to Diff 475478.Nov 15 2022, 7:46 AM
ymandel marked 2 inline comments as done.

address comments

ymandel marked 3 inline comments as done and 2 inline comments as done.Nov 15 2022, 7:49 AM
ymandel added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
68

No, you're right. :) I've completely rewritten the comments -- they were outdated and wrong. They now match what we have in TypeErasedDatafllowAnalysis. Most importantly, we've dropped any mention of defaulting to join which is wrong both in fact and conceptually. Our targeted use of widen means it should not default to join -- that would be pointless, since it's prerequisites would guarantee that join(prev, cur) = cur, making the join pointless. It instead defaults to an equivalence check.

106

I had the same thought. How about just LatticeEffect? Open to other suggestions as well. Either way, though, I should update in a separate patch in case that breaks something unexpected.

As for removing -- yes, we should remove from join, because we never use it. But, it actually makes sense here.

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
229

I actually just deleted most of the comment -- I don't think the details were helpful.

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
374–378

Shouldn't we have a simple copy ctor from PrefEnv that could populate all these fields?

We *do* have such a copy constructor. The problem is that it's overkill because we want to build LocToVal ourselves by point-wise widening of the elements. For that matter, I wish we could share, rather than copy, DeclToLoc and ExprToLoc but I don't know of any way to do that. Alternatively: is it possible that we can update in place and don't need a separate WidenedEnv?

Also, this makes me wonder if we actually want to move some of these out from environment, since these presumably would not change between basic blocks.

Agreed. Added a fixme with an associated issue to track.

xazax.hun added inline comments.Nov 15 2022, 9:15 AM
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
106

LatticeEffect sounds good to me.

114

Maybe this is another case where we want to mark the whole class final instead of the individual methods? It is fine to address this in a separate PR.

135

Is there ever a reason to instantiate Rank1? If no, should we do something (like make its ctor private and friends with Rank0?). Although possibly not important enough to worth the hassle.

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
468
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
374–378

For that matter, I wish we could share, rather than copy, DeclToLoc and ExprToLoc.

One way to maximize sharing (and making copies free) is to use the immutable datastructures in LLVM: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/ImmutableMap.h

Unfortunately, these have their own costs so it might not be easy to decide when it is worth to use them. The Clang Static Analyzer was designed from the ground up with immutable data structures in mind.

On the other hand, I see that you want a more ad-hoc/opportunistic sharing. And I actually wonder if we need any sharing at all. If we are 100% sure we will do the widening, do we actually need to preserve the previous environment? Couldn't we "consume" it and move certain fields out instead of doing the more expensive copy operation when it makes sense?

424–425

Isn't some of these tautologically false, as we only read from them?

ymandel updated this revision to Diff 475544.Nov 15 2022, 11:45 AM
ymandel marked 7 inline comments as done.

reshape widen to update in place.

ymandel marked an inline comment as done.Nov 15 2022, 11:50 AM
ymandel added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
106

Added FIXME

114

Unfortunately not, since this class is used for CRTP.

135

No reason to instantiate it that I can see, but since it's a already private to this class, I think it's good enough. But, I just borrowed this from Eric, so I'm not an expert in this pattern. Happy to take suggested improvements.

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
374–378

We considered immutable data structures early on and decided against them, based on guesses about costs. I'm sure this is worth revisiting, but probably not outside of a comprehensive performance analysis of the framework.

Regarding this API in particular, I was basing off the needs of join, which very much requires a distinct new environment. But, I think that's the wrong model, as I've better understood what we're doing here. That catch22 for join is that you need to create new flow condition which joins the 2 incoming FCs, but you also need to (potentially) read the old FC and update the new FC during the pointwise merge of the store. So, if you update in place, neither update order (store and FC) works. If you update the FC first, then the old FC is inaccesible, but if you update the store first, then the additions to the FC end up mishandled when you create the new FC from the old ones.

AFAICT, we don't have that issue here, since we're not applying any widening to the FC. So, we can modify the FC during store widening without ill affect. I went ahead and made this change, but let me know if you see a flaw in my reasoning.

424–425

Well, Model's widen function could change them, as much as that seems wrong. I suppose we could specify in the API that it shouldn't, or we could further restructure the environment so that models don't have access to these fields. Either way, if we want these to be assertable, we'll need an API change.

But, more to the point, since I've changed it to update-in-place, we'll now also capture any changes that may have occured between Prev and Current. Those should converge pretty quickly, because they related to static features of the code, so I think this is a reasonable check now.

ymandel updated this revision to Diff 475548.Nov 15 2022, 11:51 AM
ymandel marked an inline comment as done.

tweaks

xazax.hun accepted this revision.Nov 15 2022, 11:58 AM

LGTM, thanks!

clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h
114

Whoops, indeed, I missed that.

135

Let's leave as is.

This revision is now accepted and ready to land.Nov 15 2022, 11:58 AM

Does anyone else on this review thread want to review? Gabor's accepted the revision so, if not, I'll go ahead and commit.
thx

sgatev accepted this revision.Nov 22 2022, 7:23 AM
sgatev added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
471
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
420–421
ymandel updated this revision to Diff 477200.Nov 22 2022, 7:45 AM

address comments

This revision was landed with ongoing or failed builds.Nov 22 2022, 8:10 AM
This revision was automatically updated to reflect the committed changes.
ymandel marked 2 inline comments as done.