This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Add support for correlation of boolean (tracked) values
ClosedPublic

Authored by ymandel on Mar 31 2022, 10:57 AM.

Details

Summary

This patch extends the join logic for environments to explicitly handle
boolean values. It creates the disjunction of both source values, guarded by the
respective flow conditions from each input environment. This change allows the
framework to reason about boolean correlations across multiple branches (and
subsequent joins).

Diff Detail

Event Timeline

ymandel created this revision.Mar 31 2022, 10:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 10:57 AM
ymandel requested review of this revision.Mar 31 2022, 10:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 10:57 AM
xazax.hun added inline comments.Mar 31 2022, 1:26 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
79

Hmm, interesting.
I think we view every boolean formula at a certain program point implicitly as FlowConditionAtThatPoint && Formula. And the flow condition at a program point should already be a disjunction of its predecessors.

So it would be interpreted as: (FlowConditionPred1 || FlowConditionPred2) && (FormulaAtPred1 || FormulaAtPred2).
While this is great, this is not the strongest condition we could derive.
(FlowConditionPred1 && FormulaAtPred1) || (FormulaAtPred2 && FlowConditionPred2) created by this code snippet is stronger which is great.

My main concern is whether we would end up seeing an exponential explosion in the size of these formulas in the number of branches following each other in a sequence.

sgatev accepted this revision.Apr 1 2022, 12:55 AM
sgatev added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
79

Yeap, I agree this is suboptimal and I believe I'm the one to blame for introducing it downstream.

I wonder if we can represent the flow condition of each environment using a bool atom and have a mapping of bi-conditionals between flow condition atoms and flow condition constraints. Something like:

FC1 <=> C1 ^ C2
FC2 <=> C2 ^ C3 ^ C4
FC3 <=> (FC1 v FC2) ^ C5
...

We can use that to simplify the formulas here and in joinConstraints. The mapping can be stored in DataflowAnalysisContext. We can track dependencies between flow conditions (e.g. above FC3 depends on FC1 and FC2) and modify flowConditionImplies to construct a formula that includes the bi-conditionals for all flow condition atoms in the transitive set before invoking the solver.

I suggest putting the optimization in its own patch. I'd love to look into it right after this patch is submitted if both of you think it makes sense on a high level.

This revision is now accepted and ready to land.Apr 1 2022, 12:55 AM
ymandel added inline comments.Apr 1 2022, 4:28 AM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
79

This sounds good to me. That said, I'm not sure how often we'd expect this to be an issue in practice, since, IIUC, this specialized merge only occurs when the value is handled differently in the two branches. So, a series of branches alone won't trigger the exponential behavior.

xazax.hun accepted this revision.Apr 1 2022, 8:00 AM
xazax.hun added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
79

Fair point. It might never cause a problem in practice. Since we already have one proposal how to try to fix this if a problem ever surfaces, I'd love to have a comment about this in the code. But I think this is something that probably does not need to be addressed in the near future.

ymandel updated this revision to Diff 419764.Apr 1 2022, 8:12 AM

Add comment about future optimization potential.

ymandel marked an inline comment as done.Apr 1 2022, 8:14 AM
ymandel added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
79

Good idea -- I reformatted essentially sgatev's response above as a comment in the code.

This revision was automatically updated to reflect the committed changes.
ymandel marked an inline comment as done.