This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Use dedicated tokens to encode which branch was taken
AbandonedPublicDraft

Authored by samestep on Jul 7 2022, 6:49 AM.

Details

Reviewers
NoQ
Summary

This patch attempts to resolve a FIXME in DataflowEnvironment.cpp about merging BoolValues when the two flow conditions are not mutually exclusive. The issue here is that we are using the truth value of each flow condition as a proxy for the idea of whether or not we took that branch, which breaks down on backedges.

To try to resolve the issue, this patch creates two additional AtomicBoolValues to directly represent which branch we took, and uses those directly within mergeDistinctValues. As currently written, this doesn't quite work: it compiles, but 20 tests from ClangAnalysisFlowSensitiveTests fail. This patch is therefore simply meant as a starting point for discussion on how to properly resolve this issue.

Diff Detail

Event Timeline

samestep created this revision.Jul 7 2022, 6:49 AM
Herald added a project: Restricted Project. · View Herald Transcript
samestep edited the summary of this revision. (Show Details)Jul 7 2022, 6:53 AM
ymandel added a subscriber: ymandel.Jul 8 2022, 6:31 AM

Can you give some examples (or the whole list) of failing tests?

clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
154–156

Should we also include TookFirstBranchToken \/ TookSecondBranchToken? If so, i suppose we could also rephrase as (NOT First AND Second) OR (First AND NOT Second) which may get at the point a bit more directly.

Can you give some examples (or the whole list) of failing tests?

Yeah, I was going to do that originally but the log was too long so I couldn't scroll all the way up in my terminal history. Anyways, here's a pastebin: https://pastebin.com/uir7YKNL

clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
154–156

That should already be covered by the other constraint added below, right? Although actually, now I'm thinking that since we have exactly two choices, maybe we should just use one single TookFirstBranchToken, where "true" means we took the first branch and "false" means we took the second branch.

Can you give some examples (or the whole list) of failing tests?

Yeah, I was going to do that originally but the log was too long so I couldn't scroll all the way up in my terminal history. Anyways, here's a pastebin: https://pastebin.com/uir7YKNL

Based on those failures, looks like it broke something fundamental. Basically, lots of conditions are now true that shouldn't be. I'd start with the FlowConditionTest cases since those are simple and their failure will probably indicate the problem pretty quickly. But, I'd first recommend making the single-bool change you mentioned.

clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
154–156

right. I hadn't thought it through. that said, I agree with your intuition of just using one boolean.

sgatev added a subscriber: sgatev.Jul 15 2022, 2:43 AM
sgatev added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h
219

Let's not call the new params "tokens" since they are plain bool values, not flow condition tokens.

clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
151–152

This shouldn't be necessary - TookFirstBranchToken and TookSecondBranchToken are not flow condition tokens.

samestep abandoned this revision.Jul 21 2022, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2022, 7:01 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript