This is an archive of the discontinued LLVM Phabricator instance.

[clang][CFG] Support construction of a weak topological ordering of the CFG.
ClosedPublic

Authored by ymandel on Jun 15 2023, 11:15 AM.

Details

Summary

This patch adds support for building a weak topological ordering of the CFG
blocks, based on a limit flow graph constructed via (repeated) interval
partitioning of the CFG.

Diff Detail

Event Timeline

ymandel created this revision.Jun 15 2023, 11:15 AM
Herald added a project: Restricted Project. · View Herald Transcript
ymandel requested review of this revision.Jun 15 2023, 11:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2023, 11:15 AM
hliao added a subscriber: hliao.Jun 15 2023, 11:16 AM
ymandel updated this revision to Diff 531843.Jun 15 2023, 11:19 AM

fix some comments.

ymandel updated this revision to Diff 531865.Jun 15 2023, 12:09 PM

Fix build break and add some field comments.

ymandel updated this revision to Diff 531874.Jun 15 2023, 12:18 PM

more comment expansion.

Could you rebase this patch to the latest tip of tree?

clang/include/clang/Analysis/Analyses/IntervalPartition.h
112

typo?

113

Is feedback edge a special terminology? I think we might simply call these back edges most of the time.

116

same typo here?

clang/lib/Analysis/IntervalPartition.cpp
23–48

What is the reason for putting a using statement in an anonymous namespace?

ymandel updated this revision to Diff 535397.Jun 28 2023, 7:24 AM

Rebase and address some comments.

ymandel updated this revision to Diff 535402.Jun 28 2023, 7:28 AM

Address more comments.

ymandel marked 4 inline comments as done.Jun 28 2023, 7:30 AM

Rebased, thanks!

clang/include/clang/Analysis/Analyses/IntervalPartition.h
113

That's the language from the paper but I prefer standard "back edge" terminology so updated to reflect.

clang/lib/Analysis/IntervalPartition.cpp
23–48

I... don't remember what I was thinking. Regardless, seems pointless for a template -- fixed.

xazax.hun added inline comments.Jun 28 2023, 8:47 AM
clang/unittests/Analysis/IntervalPartitionTest.cpp
21

Another redundant anonymous namespace here.

xazax.hun added inline comments.Jun 28 2023, 9:49 AM
clang/lib/Analysis/IntervalPartition.cpp
115

It would probably take for me some time to understand what is going on here, but I think this part might worth a comment. In particular, I have the impression that Graph.Intervals is the owner of all the intervals. But we use pointers to refer to these intervals. What would guarantee that those pointers are not being invalidated by this emplace_back operations?

ymandel marked 2 inline comments as done.Jun 28 2023, 9:59 AM
ymandel added inline comments.
clang/lib/Analysis/IntervalPartition.cpp
115

Excellent question, not least because I *wasn't* careful the first around and indeed ran up against this as a memory bug. :) That's the reason I added IDs, so that I could _avoid_ taking the addresses of elements until after we finish growing the vector. See lines 148-165: after we finish building the intervals, we go through and take addresses.

I can add comments to this effect, but perhaps the safe option is to use a std::dequeue. What do you think?

xazax.hun added inline comments.Jul 5 2023, 2:58 AM
clang/include/clang/Analysis/Analyses/IntervalPartition.h
93–101

Correct me if I'm wrong but I have the impression that CFGInterval might either contain only CfgBlocks or other CfgIntervals, but these might never be mixed. The current representation does allow such mixing. In case do not need that, I think it might be cleaner to make CfgInterval itself templated and remove the std::variant.

127–129

I wonder if we still want to somehow enforce that within a loop/interval the we visit nodes in the RPO order.

clang/lib/Analysis/IntervalPartition.cpp
36
115

Oh, thanks! I don't have a strong preference between a comment or a deque.

Having (mostly!) understood what this is doing, the algorithm looks like it's doing the right thing.

High-level ideas would be:

  • there are three different representations/sets of APIs exposed: intervals, ordering, and compare. It seems like we only have immediate plans to use compare, and maybe ordering in the future. Maybe we could hide the stuff we're not using, if it's necessary to test it then a narrower interface for testing might be enough? Like string printIntervalGraph(CFG, int iters) and then all of Interval could be hidden.
  • the data structures for the interval graph are mechanically complicated (trees with differently templated node types etc), and dealing with this incidental complexity and the algorithm complexity is a lot. It may be possible to use simpler data structures, as we don't make use of everything in these ones.
clang/include/clang/Analysis/Analyses/IntervalPartition.h
68–69

Summarizing an offline discussion, IIUC...

  • we build a tree of intervals that represents the whole sequence of interval graphs
  • for the current worklist algorithm, we use only the total order and process the first invalid block, the tree is not needed
  • the algorithm/data structure could be significantly simplified (no variants, templates, recursion...) if it didn't build the tree, just the interval graphs iteratively until a single node is reached
  • however a different, non-worklist algorithm could use the tree information to determine which blocks to visit next. (Given a (b c d) e, after d we try only b or e). This avoids expensive "is invalid" checks
  • so the motivation for preserving the full interval tree is future-proofing for using a different algorithm

I think it's worth documenting this motivation, and also considering not paying this complexity cost for a feature we aren't adding yet.

93

(Very optional idea, in case you like it)

This data structure is complicated: a tree of Intervals with CFGBlocks at the root, and at every level the Intervals have pred/succ edges that form part of one of the intermediate flow graphs.

It seems nice to get rid of the tree if possible (ultimately we're using these for ordering) and the nested pred/succs - we seem to only need these for the top level interval (and in the end, we have one interval and pred/succs are empty.

What if we made this a flat structure, preserving the bracket structure as some kind of back-edge?

1  (  2  (  3  4  )  )  5
becomes
1     2     3  4 !3 !2  5

AIUI, the dataflow algorithm can basically treat this as a program, we move left to right and !X means "if (X is invalidated) goto X".

This would give us something like:

using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a conditional jump
struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, Succs; }
using IntervalGraph = vector<Interval>;

I also suspect that this would let us avoid the core algorithms (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock in a trivial Interval to start with instead. I guess this is OK performance-wise, but the reason not to do it is ending up with those nodes polluting your final data structure?

Anyway, all this is up to you - for me keeping more data/structure around than we're going to use is confusing, but it does seem closer to the textbook.

149

Naming is confusing here: getWTO is the internal function whose API involves intervals, and getIntervalWTO be the main function whose API doesn't. I think literally swapping them might be clearer!

in general it's hard to tell which parts of this API people are actually meant to use.
Is it reasonable to wrap everything up to WeakTopologicalOrdering in namespace internal or so? Or are there places this will be used outside tests?

152

consider instead exposing printWTO as operator<<.

Then you get formatWTO for free as llvm::to_string(), WeakTopologicalOrdering prints reasonably with gtest, you can log it directly, etc.

clang/lib/Analysis/IntervalPartition.cpp
30

remove commented out code

36

here specifically llvm::is_contained(Interval, N) can be used directly, this function is not needed

52

I'm confused, here and below why are successors allowed to be nullptr?

76–77

consider

bool AllInInterval = llvm::all_of(B->preds(), [&](const Node *P) {
  return inInterval(P, Interval.Nodes);
});

(usually i prefer the plain loop, but tracking booleans is annoying)

76–77

you always add to MaybeSuccessors this exactly once when AllInInterval ends up false, but that's not totally obvious.

Consider moving this below the loop, to the else in if (AllInInterval)

ymandel updated this revision to Diff 541141.Jul 17 2023, 10:55 AM

Radically simplify the code, per comments.

ymandel updated this revision to Diff 541153.Jul 17 2023, 11:20 AM
ymandel marked 12 inline comments as done.

tweaks

Thanks for the very helpful feedback!
I think I've addressed all of the suggestions/concerns. I suspect there's yet more room for improvement in the algorithm, but I think we're converging on "good enough".

clang/include/clang/Analysis/Analyses/IntervalPartition.h
68–69

Sam, I've radically simplified the implementation to only build exactly what's needed -- the ordering. PTAL and let me know what further comments/documentation would help.

93

I went with this proposal and stripped it down to only track what we need. I didn't even preserve the backedge structure, because we can't use that yet. If and when we get there, I think you're approach or something similar, where we avoid the tree structure, is a great idea.

93–101

True, but the Header field gets in the way, since it needs to have a recursive type (it's type is always one level down from the interval's type). So, somewhere you'll need a variant and a recursive type. The current version avoids this by dropping Header entirely, since it isn't needed in the output.

127–129

Starting with RPO should make this algorithm cheaper (and maybe simpler!). But, RPO construction isn't free. So, I think it is a matter of cost -- compare generating the RPO first and then using that here vs. just using this directly. I can put a note in the implementation as FIXME. WDYT?

149

Agreed. The only reason to expose is for tests.

clang/lib/Analysis/IntervalPartition.cpp
52

That's how the CFG encodes unreachable nodes. succs doesn't actually return CFGBlocks, but an intermediate type which encodes reachability info. When you (implicitly) cast it to CFGBlock*, unreachable nodes become a nullptr.

115

I ended up changing to a deque after almost making the same mistake again. vector is just not safe when you want to be taking addresses. :) I've dropped the use of IDs and added comments. Let me know if this clear enough.

xazax.hun added inline comments.Jul 18 2023, 7:46 AM
clang/include/clang/Analysis/Analyses/IntervalPartition.h
127–129

Sounds good to me.

157

Would it make sense to have a vector instead and use block ids to get the order?

clang/lib/Analysis/IntervalPartition.cpp
115

Looks good, thanks! I agree that the deque solution looks cleaner, probably we should get it first right and we can optimize later if there is a need for that.

ymandel updated this revision to Diff 543552.Jul 24 2023, 7:52 AM
ymandel marked 2 inline comments as done.

Addressed comments.

clang/include/clang/Analysis/Analyses/IntervalPartition.h
127–129

Added a FIXME in buildIntervals.

xazax.hun accepted this revision.Jul 24 2023, 8:42 AM

Thanks, looks good to me!

This revision is now accepted and ready to land.Jul 24 2023, 8:42 AM
sammccall accepted this revision.Jul 26 2023, 1:42 PM

Thanks, this is much simpler!
Just nits & apologies for the delay.

clang/include/clang/Analysis/Analyses/IntervalPartition.h
24

some of these added headers are now unused: map, set, variant

51–67

nit: namespace internal should probably be moved below the public API

139

nit: nullopt

139

what do we expect to do in practice when the CFG is not reducible? or do we expect that to never happen?

(mostly wondering if we need a fallback)

clang/lib/Analysis/IntervalPartition.cpp
36

Might be more useful to say concretely: Nodes are either CFGBlock or CFGIntervalNode

105

std::map of pointers is a bit suspicious, densemap?

clang/unittests/Analysis/IntervalPartitionTest.cpp
91

nit: matcher factories are functions and so should be lowerCamelCase

ymandel updated this revision to Diff 544551.Jul 26 2023, 4:23 PM
ymandel marked 2 inline comments as done.

Addressed comments

ymandel marked 5 inline comments as done.Jul 26 2023, 4:24 PM

Thanks for the helpful suggestions!

clang/include/clang/Analysis/Analyses/IntervalPartition.h
139

good question -- I'm not really sure what to expect. In practice, any structured function will yield a reducible graph. But, creative use of goto's can undermine that. So, my guess is that we will not encounter these in practice, but I don't know for sure.

As for fallback, RPO doesn't rely on reducibility. So, clients are free to use that instead when this fails.

clang/lib/Analysis/IntervalPartition.cpp
105

I went with a vector mapping (implicitly) from Node IDs, like I did for the WTO.

This revision was landed with ongoing or failed builds.Jul 27 2023, 11:57 AM
This revision was automatically updated to reflect the committed changes.
ymandel marked an inline comment as done.