This is an archive of the discontinued LLVM Phabricator instance.

[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)
ClosedPublic

Authored by sammccall on Jun 21 2023, 2:31 PM.

Details

Summary

This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:

  • flow condition tokens are Atom, rather than AtomicBoolValue*
  • conditions are Formula, rather than BoolValue

Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Diff Detail

Event Timeline

sammccall created this revision.Jun 21 2023, 2:31 PM
Herald added a project: Restricted Project. · View Herald Transcript
sammccall requested review of this revision.Jun 21 2023, 2:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2023, 2:31 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Here we start to see benefits: Value becomes less reliant on inheritance, flow conditions are no longer Values that can uselessly bear properties, atomic variables are numbered in order of creation, we'll soon be able to have distinct BoolValues with the same formula (but different properties).

I think I could separate the Value/Arena changes from the ones to use Atom/Formula in DAContext if you think it's useful (the drawback is some risk of ending up in merge hell...)

Just some top-level comments to begin with.

Here we start to see benefits: Value becomes less reliant on inheritance, flow conditions are no longer Values that can uselessly bear properties, atomic variables are numbered in order of creation, we'll soon be able to have distinct BoolValues with the same formula (but different properties).

I like where this is going. Specifically, I like that we will soon no longer be relying in BoolValue object identity and will be able to have different BoolValues with the same formula. I'm working on a similar change for StructValue, so I'm well aware of how problematic it is to share the same Value but then modify properties on it -- it seems like a potential source of really puzzling bugs.

I think I could separate the Value/Arena changes from the ones to use Atom/Formula in DAContext if you think it's useful (the drawback is some risk of ending up in merge hell...)

I don't think that's necessary -- this patch is large in terms of numbers of lines changed, but when you eliminate the mechanical changes, there's not too much going on.

gribozavr2 accepted this revision.Jun 30 2023, 11:49 AM
gribozavr2 added inline comments.
clang/include/clang/Analysis/FlowSensitive/Arena.h
61

Per our offline conversation, do you still believe in this fixme?

clang/include/clang/Analysis/FlowSensitive/Value.h
132
146
This revision is now accepted and ready to land.Jun 30 2023, 11:49 AM
sammccall marked 3 inline comments as done.Jul 5 2023, 4:53 AM
sammccall added inline comments.
clang/include/clang/Analysis/FlowSensitive/Arena.h
61

No, I'm no longer convinced "unsound" is true.

However, I do think there's something to be fixed: if we really believe in it, we should be interning other types of values too (or have a clear idea about why bools are special).

And I think I still lean towards removing it: it looks like a power-vs-complexity tradeoff in the Value concept, and I suspect the historical reason to intern values was to intern formulas, not because we needed the power.

Reworded the fixme to say we should decide and be more consistent - text doesn't take a position on which way we should decide.

This revision was landed with ongoing or failed builds.Jul 5 2023, 4:56 AM
This revision was automatically updated to reflect the committed changes.
sammccall marked an inline comment as done.
uabelho added a subscriber: uabelho.Jul 7 2023, 3:20 AM
uabelho added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
521

There are still uses of this method in-tree. I get the following error (and many more) when I compile trunk now with -Werror:

../../clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp:148:9: error: 'addToFlowCondition' is deprecated: Use Formula version instead [-Werror,-Wdeprecated-declarations]
    Env.addToFlowCondition(*Val);
        ^
../../clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h:532:3: note: 'addToFlowCondition' has been explicitly marked deprecated here
  LLVM_DEPRECATED("Use Formula version instead", "")
  ^
../include/llvm/Support/Compiler.h:155:50: note: expanded from macro 'LLVM_DEPRECATED'
#define LLVM_DEPRECATED(MSG, FIX) __attribute__((deprecated(MSG, FIX)))
                                                 ^
1 error generated.
sammccall added inline comments.Jul 7 2023, 4:58 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
521

Sure, I plan to clean these up once this change sticks.

Do you think -Wdeprecated -Werror is a build config we should keep working at all times?
This would mean you we can't use deprecation to migrate internal callers over multiple patches, essentially

So building with -Wno-error=deprecated-declarations or so seems to make more sense to me, but I can't see anything written either way. (Nor buildbot mail)

uabelho added inline comments.Jul 7 2023, 5:08 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
521

There are bots failing now at least:
https://lab.llvm.org/buildbot/#/builders/57
https://lab.llvm.org/buildbot/#/builders/214

I *think* deprecation is mainly used to let downstream adapt? And when deprecating stuff in-tree code is usually fixed at once?

sammccall added inline comments.Jul 7 2023, 5:36 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
521

Fair enough, thanks for the bot links

I don't think expanding the scope of the change for now is a good idea.
9d5cfed2443aa294e67bea694d5aab02c40fa278 replaces [[deprecated]] with a comment.

uabelho added inline comments.Jul 7 2023, 5:58 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
521

Thank you!