This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Restructure loops to call widen on back edges
Accepted · Public

Authored by li.zhe.hua on Aug 10 2022, 9:58 PM.

Details

Summary

When navigating a loop block, we call the lattice's widen operator,
which gives a lattice of infinite height the opportunity to reach
convergence.

As we enter the loop, we store the block state in the back edge block,
which represents the state after the zeroth iteration. Then, after
each loop iteration, we widen the previous iteration's state with the
new iteration's state.
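As a sketch only (the names below are hypothetical, not the framework's actual API), the scheme amounts to seeding the back-edge state with a copy of the predecessor state and then widening the previous iteration's state with the new iteration's state until nothing changes:

```cpp
#include <climits>

// Toy lattice element: an integer interval [Lo, Hi], with INT_MIN/INT_MAX
// standing in for -inf/+inf. Purely illustrative; not the framework's
// type-erased lattice.
struct Interval {
  int Lo, Hi;
  bool operator==(const Interval &O) const { return Lo == O.Lo && Hi == O.Hi; }
};

// widen keeps a stable bound and pushes a still-moving bound to +/-inf,
// so repeated widening converges even on an infinite-height lattice.
Interval widen(const Interval &Prev, const Interval &Cur) {
  return {Cur.Lo < Prev.Lo ? INT_MIN : Prev.Lo,
          Cur.Hi > Prev.Hi ? INT_MAX : Prev.Hi};
}

// Hypothetical transfer function for a loop body `++i`.
Interval transferIncrement(const Interval &I) {
  return {I.Lo == INT_MIN ? INT_MIN : I.Lo + 1,
          I.Hi == INT_MAX ? INT_MAX : I.Hi + 1};
}

// The scheme described above: the "zeroth iteration" back-edge state is a
// copy of PredState; after each iteration, widen the previous state with
// the new one until a fixpoint is reached.
Interval analyzeBackEdge(Interval PredState) {
  Interval BackEdgeState = PredState;
  for (;;) {
    Interval Next = widen(BackEdgeState, transferIncrement(BackEdgeState));
    if (Next == BackEdgeState)
      return BackEdgeState;
    BackEdgeState = Next;
  }
}
```

Starting from `i : [0, 0]`, this converges in two steps to `[0, +inf]`: the first widening pushes the growing upper bound to +inf, and the second iteration is then stable.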

Tracking issue: #56931

Depends on D131645

Diff Detail

Event Timeline

li.zhe.hua created this revision.Aug 10 2022, 9:58 PM
Herald added a project: Restricted Project. · View Herald Transcript
li.zhe.hua requested review of this revision.Aug 10 2022, 9:58 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ymandel added a subscriber: gribozavr.
ymandel accepted this revision.Aug 11 2022, 5:47 AM

Nice!

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
180

Might it be worth simply returning the backedge when you find it? Or is the assertion (above) sufficiently important to keep it as is?

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
276

Might a (googletest) assertion here be better than llvm::cantFail? I would think that this line is the crux of checking whether it converges or not.

This revision is now accepted and ready to land.Aug 11 2022, 5:47 AM
li.zhe.hua marked 2 inline comments as done.

Address comments

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
180

So, from staring at the CFG code I couldn't prove to myself that a block can't have two back-edge predecessors, but it seems conceptually unlikely enough that I'm OK simplifying this.

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
276

Ah, yes. I hadn't thought to look for llvm::Expected matchers; good call.

xazax.hun added inline comments.Aug 11 2022, 11:18 AM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
168–169

Is this also true when we have multiple continue statements in the loop?

226–232

Could you elaborate on this? Let's consider this loop:

Pred
  |
  v
LoopHeader <---BackEdge

Do we ignore the state coming from Pred on purpose? Is that sound?

I would expect the analysis to always compute join(PredState, BackEdgeState), and I would expect the widening to happen between the previous iteration of BackEdgeState and the current iteration of BackEdgeState. So, I wonder if we already invoke the widening operator along back edges, wouldn't the regular logic work just fine? Do I miss something?

Fix incorrect assumption that back edge blocks have an empty body.

li.zhe.hua marked an inline comment as done.Aug 11 2022, 12:15 PM
li.zhe.hua added inline comments.
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
168–169

Yes. The end of the loop, and each of the continue statements, target the back edge block. They all get funneled through that back edge to Block, such that Block only has two predecessors. However, I haven't verified this in the CFG code, only by not finding a counterexample.

226–232

Do we ignore the state coming from Pred on purpose? Is that sound?

We don't, and this is what the comment

// For the first iteration of loop, the "zeroth" iteration state is set up by
// `prepareBackEdges`.

failed to explain. After transferring PredState, we copy PredState into BackEdgeState, which is done in prepareBackEdges.

I would expect the analysis to always compute join(PredState, BackEdgeState)

I'm not sure I see that we should always join PredState and BackEdgeState. Execution doesn't go from Pred into the Nth iteration of the loop, it only goes from Pred into the first iteration of the loop, e.g. the predecessor for the 4th iteration of the loop is only the back-edge from the 3rd iteration of the loop, not Pred.

Let me know if this makes sense.

xazax.hun added inline comments.Aug 11 2022, 12:35 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
226–232

I'm not sure I see that we should always join PredState and BackEdgeState. Execution doesn't go from Pred into the Nth iteration of the loop, it only goes from Pred into the first iteration of the loop, e.g. the predecessor for the 4th iteration of the loop is only the back-edge from the 3rd iteration of the loop, not Pred.

Let me know if this makes sense.

The analysis state of a dataflow analysis is supposed to overapproximate all of the execution paths. Consider the following loop, and imagine we want to calculate intervals for integer variables:

int i = 0;
while (...) {
  [[program point A]];
  ++i;
}

During the analysis of this loop, the value i == 0 flows to [[program point A]]. This is the motivation for joining the state from the back edge with PredState. As far as I understand, you want to achieve this by copying PredState to BackEdgeState before the first iteration. But this also means that we will use the widening operator to combine PredState with the state after N iterations instead of the regular join operation. I am not entirely sure whether these two approaches always produce the same results.

xazax.hun added inline comments.Aug 11 2022, 12:58 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
226–232

A contrived example (I just came up with this in a couple of minutes, so feel free to double-check it):

int i = 0;
while (...) {
  if (i == 0)
    i = -2;
  ++i;
}

Some states:

PredState = i : [0, 0]
BackEdgeState (after first iteration) = i : [-1, -1]

And the results of join vs widen:

PredState.join(BackEdgeState) = i : [-1, 0]
PredState.widen(BackEdgeState) = i : [-inf, 0]

The extra widening with the PredState can result in extra imprecision.
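The arithmetic above can be checked mechanically with a toy interval domain (a sketch only; `join` and `widen` here are illustrative, not the framework's API, and `INT_MIN`/`INT_MAX` stand in for -inf/+inf):

```cpp
#include <climits>

// Toy lattice element: an integer interval [Lo, Hi].
struct Interval {
  int Lo, Hi;
  bool operator==(const Interval &O) const { return Lo == O.Lo && Hi == O.Hi; }
};

// join: least upper bound -- the smallest interval covering both operands.
Interval join(const Interval &A, const Interval &B) {
  return {A.Lo < B.Lo ? A.Lo : B.Lo, A.Hi > B.Hi ? A.Hi : B.Hi};
}

// widen: like join, but any bound that grew relative to Prev is pushed to
// +/-inf so that repeated application converges.
Interval widen(const Interval &Prev, const Interval &Cur) {
  return {Cur.Lo < Prev.Lo ? INT_MIN : Prev.Lo,
          Cur.Hi > Prev.Hi ? INT_MAX : Prev.Hi};
}
```

With `PredState = {0, 0}` and `BackEdgeState = {-1, -1}`, `join` yields `[-1, 0]` while `widen` yields `[-inf, 0]`: the lower bound moved, so widening discards it entirely, which is exactly the extra imprecision described above.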

sgatev added inline comments.Aug 12 2022, 4:08 AM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
157

Let's start function comments with /// throughout the file.

168–169

Does that hold for back edges stemming from goto statements?

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
270

There's already a set of "widening" tests - http://google3/third_party/llvm/llvm-project/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp;l=527-712;rcl=462638952

What do you think about refactoring those so that we have tests that exercise the framework with both join and widen?

li.zhe.hua marked 4 inline comments as done.

Address comments

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
168–169

Ah, goto does throw a wrench in all of this. Specifically, it is problematic only for do ... while loops, where the labeled statement is the first statement of the do block.

I took out the sentence, took out the assert, and added a FIXME for prepareBackEdges.

226–232

So, my understanding is that widening introduces imprecision as a trade-off for convergence. Yes, in the contrived example, joining converges after a few iterations, but in other cases it never converges.


Looking at the Rival and Yi book, there are no intervening joins with the predecessor state as we analyze the loop. This is roughly what I drew from for this implementation.

analysis(iter{p}, a) = { R <- a;
                         repeat
                             T <- R;
                             R <- widen(R, analysis(p, R));
                         until inclusion(R, T)
                         return T; }

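Translated into runnable form (a sketch with a toy interval domain; `inclusion` plays the role of the abstract partial order, and none of these names are the framework's actual API):

```cpp
#include <climits>

// Toy lattice element: an integer interval [Lo, Hi], with INT_MIN/INT_MAX
// standing in for -inf/+inf.
struct Interval {
  int Lo, Hi;
  bool operator==(const Interval &O) const { return Lo == O.Lo && Hi == O.Hi; }
};

// inclusion(R, T): is R contained in T? (the abstract partial order)
bool inclusion(const Interval &R, const Interval &T) {
  return T.Lo <= R.Lo && R.Hi <= T.Hi;
}

// widen keeps stable bounds and pushes moving bounds to +/-inf.
Interval widen(const Interval &Prev, const Interval &Cur) {
  return {Cur.Lo < Prev.Lo ? INT_MIN : Prev.Lo,
          Cur.Hi > Prev.Hi ? INT_MAX : Prev.Hi};
}

// Direct transcription of the pseudocode: iterate the loop body's abstract
// semantics `analyzeBody` from the entry state `A`, widening until the
// result is included in the previous candidate.
template <typename Transfer>
Interval analyzeLoop(Transfer analyzeBody, Interval A) {
  Interval R = A;
  for (;;) {
    Interval T = R;
    R = widen(R, analyzeBody(R));
    if (inclusion(R, T))
      return T;
  }
}
```

For a body `++i` starting from `i : [0, 0]`, this terminates with `[0, +inf]`; note that the entry state `a` only seeds `R` and is never re-joined on later iterations, matching the point about the predecessor state.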
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
270

This commit implements widening only for the lattice. Those tests use the NoopLattice, so their behavior is not expected to change with this commit. Splitting merge into a join and a widen is out of scope for this commit and is what I am hoping to work on next.

xazax.hun added inline comments.Aug 19 2022, 2:20 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
226–232

So, my understanding is that widening introduces imprecision as a trade-off for convergence. Yes, in the contrived example, joining converges after a few iterations, but in other cases it never converges.

In my example I do not advocate for NOT doing widening. I am advocating for not applying the widening operator with the loop pre-state and the loop iteration state as operands. My example demonstrates how that introduces additional imprecision. Using the widening operator only between loop iterations, and not for those two states, would still ensure convergence but would give you more precision. My argument is that using widening in that step is not required for convergence, but we do pay for it in precision. Or do you have an example where you need widening for those two specific states?

Looking at the Rival and Yi book, there are no intervening joins with the predecessor for the first loop as we analyze the loop. This is roughly where I drew from in terms of this implementation.

Later in the book, they talk about loop unrolling, where we don't use the widening operator in the first couple of iterations. The same imprecision I was talking about here could also be solved by unrolling the first iteration. I believe the book might not talk about this technique because unrolling subsumes it.

The main reason I am advocating for this is that, if we are willing to accept the join there, we would no longer need to prepare the back edges, which would simplify the implementation while improving the precision of the analysis.