This is an archive of the discontinued LLVM Phabricator instance.

[clang][CFG] Add support for partitioning CFG into intervals.
ClosedPublic

Authored by ymandel on Jun 6 2023, 6:29 AM.

Details

Summary

Adds support for the classic dataflow algorithm that partitions a flow graph
into distinct intervals. C.f. Dragon book, pp. 664-666.

A version of this algorithm exists in LLVM (see llvm/Analysis/Interval.h and
related files), but it is specific to LLVM, is a recursive (vs iterative)
algorithm, and uses many layers of abstraction that seem unnecessary for CFG
purposes.

This patch is part 1 of N. Subsequent patches will generalize the code to work
on intervals to support computation of the limit flow graph.

Diff Detail

Event Timeline

ymandel created this revision.Jun 6 2023, 6:29 AM
Herald added a project: Restricted Project. · View Herald Transcript
ymandel requested review of this revision.Jun 6 2023, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 6 2023, 6:29 AM
ymandel updated this revision to Diff 528845.Jun 6 2023, 7:00 AM

fix formatting in header file

xazax.hun added inline comments.Jun 6 2023, 12:45 PM
clang/include/clang/Analysis/Analyses/IntervalPartition.h
22

A concise definition of what is an interval with a reference to the dragon book might be useful here.

ymandel updated this revision to Diff 531702.Jun 15 2023, 4:58 AM

Expand comments

ymandel marked an inline comment as done.Jun 15 2023, 4:58 AM
xazax.hun added inline comments.Jun 18 2023, 12:10 PM
clang/include/clang/Analysis/Analyses/IntervalPartition.h
36

Nit: I wonder if we want something like llvm::DenseSet when we use smaller types like pointers. Same for Successors.

clang/lib/Analysis/IntervalPartition.cpp
27

Is it possible we end up adding the same node to the queue multiple times? Is that desirable or do we want to make sure we only have each node at most once?

38

Same question here, is it possible we might end up adding the same nodes multiple times?

47–48

I wonder if this approach is correct. Consider the following scenario:

  A
 / \
B   C
|   |
|   D
 \ /
  E

In the BFS, we might visit: ABCED. Since we visit E before D, we might not recognize that E is part of the interval. Do I miss something?

ymandel updated this revision to Diff 533641.Jun 22 2023, 9:08 AM
ymandel marked 2 inline comments as done.

responded to comments

ymandel marked 2 inline comments as not done.Jun 22 2023, 9:09 AM
ymandel added inline comments.
clang/lib/Analysis/IntervalPartition.cpp
27

Added an answer in comments, but repeating here in case you want to discuss further:

// The worklist *may* contain duplicates. We guard against this possibility by
// checking each popped element for completion (that is, presence in
// `Partitioned`). We choose this approach over using a set because the queue
// allows us to flexibly insert and delete elements while iterating through
// the list. A set would require two separate phases for iterating and
// mutation.
38

Added an answer in comments, but repeating here in case you want to discuss further:

// It may contain duplicates -- ultimately, all relevant elements
// are added to `Interval.Successors`, which is a set.
47–48

I wonder if this approach is correct. Consider the following scenario:

  A
 / \
B   C
|   |
|   D
 \ /
  E

In the BFS, we might visit: ABCED. Since we visit E before D, we might not recognize that E is part of the interval. Do I miss something?

When we add D to the interval, we'll push E onto the queue again (lines 58-59). The second time that E is considered it will have both successors in the interval and will be added as well.

xazax.hun accepted this revision.Jun 23 2023, 1:29 AM
xazax.hun added inline comments.
clang/lib/Analysis/IntervalPartition.cpp
27

I see, thanks! This addresses my concerns. I think in some cases we use a bitset with the blocks' ids to more efficiently track things like that.

47–48

Ah, I see, makse sense, thanks! This also makes me wonder, would this algorithm be more efficient if we used RPO (in case we can use that without recalculating the order)? :D But we do not need to benchmark this for the first version.

This revision is now accepted and ready to land.Jun 23 2023, 1:29 AM
ymandel updated this revision to Diff 535002.Jun 27 2023, 8:43 AM

Respond to comments

ymandel marked 3 inline comments as done.Jun 27 2023, 8:49 AM
ymandel added inline comments.
clang/lib/Analysis/IntervalPartition.cpp
27

Thanks. I experimented with that and added some use of bitsets here (llvm::BitVector). I didn't go all out though since the CFG doesn't provide a reverse mapping from block ID to block pointer, so I still used SmallDenseSet where we'll need the block pointers.

47–48

I suspect it would be though I think that any traversal based ordering vs. random will still be pretty good, especially since the queue already pushes us towards breadth-first. I didn't use RPO because I figured that the cost of computing it would be greater than any potential benefit.

xazax.hun accepted this revision.Jun 27 2023, 9:14 AM

Thanks!

This revision was landed with ongoing or failed builds.Jun 27 2023, 10:08 AM
This revision was automatically updated to reflect the committed changes.
ymandel marked 2 inline comments as done.