This patch adds support for building a weak topological ordering of the CFG
blocks, based on a limit flow graph constructed via (repeated) interval
partitioning of the CFG.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Could you rebase this patch to the latest tip of tree?
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
76 | Is feedback edge a special terminology? I think we might simply call these back edges most of the time. | |
86 | typo? | |
90 | same typo here? | |
clang/lib/Analysis/IntervalPartition.cpp | ||
23–51 | What is the reason for putting a using statement in an anonymous namespace? |
Rebased, thanks!
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
76 | That's the language from the paper but I prefer standard "back edge" terminology so updated to reflect. | |
clang/lib/Analysis/IntervalPartition.cpp | ||
23–51 | I... don't remember what I was thinking. Regardless, seems pointless for a template -- fixed. |
clang/unittests/Analysis/IntervalPartitionTest.cpp | ||
---|---|---|
21 | Another redundant anonymous namespace here. |
clang/lib/Analysis/IntervalPartition.cpp | ||
---|---|---|
121 | It would probably take for me some time to understand what is going on here, but I think this part might worth a comment. In particular, I have the impression that Graph.Intervals is the owner of all the intervals. But we use pointers to refer to these intervals. What would guarantee that those pointers are not being invalidated by this emplace_back operations? |
clang/lib/Analysis/IntervalPartition.cpp | ||
---|---|---|
121 | Excellent question, not least because I *wasn't* careful the first around and indeed ran up against this as a memory bug. :) That's the reason I added IDs, so that I could _avoid_ taking the addresses of elements until after we finish growing the vector. See lines 148-165: after we finish building the intervals, we go through and take addresses. I can add comments to this effect, but perhaps the safe option is to use a std::dequeue. What do you think? |
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
56–64 | Correct me if I'm wrong but I have the impression that CFGInterval might either contain only CfgBlocks or other CfgIntervals, but these might never be mixed. The current representation does allow such mixing. In case do not need that, I think it might be cleaner to make CfgInterval itself templated and remove the std::variant. | |
101–103 | I wonder if we still want to somehow enforce that within a loop/interval the we visit nodes in the RPO order. | |
clang/lib/Analysis/IntervalPartition.cpp | ||
36 | We have some convenience wrappers in LLVM: https://llvm.org/doxygen/namespacellvm.html#a086e9fbdb06276db7753101a08a63adf | |
121 | Oh, thanks! I don't have a strong preference between a comment or a deque. |
Having (mostly!) understood what this is doing, the algorithm looks like it's doing the right thing.
High-level ideas would be:
- there are three different representations/sets of APIs exposed: intervals, ordering, and compare. It seems like we only have immediate plans to use compare, and maybe ordering in the future. Maybe we could hide the stuff we're not using, if it's necessary to test it then a narrower interface for testing might be enough? Like string printIntervalGraph(CFG, int iters) and then all of Interval could be hidden.
- the data structures for the interval graph are mechanically complicated (trees with differently templated node types etc), and dealing with this incidental complexity and the algorithm complexity is a lot. It may be possible to use simpler data structures, as we don't make use of everything in these ones.
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
38 | Summarizing an offline discussion, IIUC...
I think it's worth documenting this motivation, and also considering not paying this complexity cost for a feature we aren't adding yet. | |
56 | (Very optional idea, in case you like it) This data structure is complicated: a tree of Intervals with CFGBlocks at the root, and at every level the Intervals have pred/succ edges that form part of one of the intermediate flow graphs. It seems nice to get rid of the tree if possible (ultimately we're using these for ordering) and the nested pred/succs - we seem to only need these for the top level interval (and in the end, we have one interval and pred/succs are empty. What if we made this a flat structure, preserving the bracket structure as some kind of back-edge? 1 ( 2 ( 3 4 ) ) 5 becomes 1 2 3 4 !3 !2 5 AIUI, the dataflow algorithm can basically treat this as a program, we move left to right and !X means "if (X is invalidated) goto X". This would give us something like: using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a conditional jump struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, Succs; } using IntervalGraph = vector<Interval>; I also suspect that this would let us avoid the core algorithms (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock in a trivial Interval to start with instead. I guess this is OK performance-wise, but the reason not to do it is ending up with those nodes polluting your final data structure? Anyway, all this is up to you - for me keeping more data/structure around than we're going to use is confusing, but it does seem closer to the textbook. | |
123 | Naming is confusing here: getWTO is the internal function whose API involves intervals, and getIntervalWTO be the main function whose API doesn't. I think literally swapping them might be clearer! in general it's hard to tell which parts of this API people are actually meant to use. | |
126 | consider instead exposing printWTO as operator<<. Then you get formatWTO for free as llvm::to_string(), WeakTopologicalOrdering prints reasonably with gtest, you can log it directly, etc. | |
clang/lib/Analysis/IntervalPartition.cpp | ||
30 | remove commented out code | |
36 | here specifically llvm::is_contained(Interval, N) can be used directly, this function is not needed | |
55 | I'm confused, here and below why are successors allowed to be nullptr? | |
80 | consider bool AllInInterval = llvm::all_of(B->preds(), [&](const Node *P) { return inInterval(P, Interval.Nodes); }); (usually i prefer the plain loop, but tracking booleans is annoying) | |
83 | you always add to MaybeSuccessors this exactly once when AllInInterval ends up false, but that's not totally obvious. Consider moving this below the loop, to the else in if (AllInInterval) |
Thanks for the very helpful feedback!
I think I've addressed all of the suggestions/concerns. I suspect there's yet more room for improvement in the algorithm, but I think we're converging on "good enough".
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
38 | Sam, I've radically simplified the implementation to only build exactly what's needed -- the ordering. PTAL and let me know what further comments/documentation would help. | |
56 | I went with this proposal and stripped it down to only track what we need. I didn't even preserve the backedge structure, because we can't use that yet. If and when we get there, I think you're approach or something similar, where we avoid the tree structure, is a great idea. | |
56–64 | True, but the Header field gets in the way, since it needs to have a recursive type (it's type is always one level down from the interval's type). So, somewhere you'll need a variant and a recursive type. The current version avoids this by dropping Header entirely, since it isn't needed in the output. | |
101–103 | Starting with RPO should make this algorithm cheaper (and maybe simpler!). But, RPO construction isn't free. So, I think it is a matter of cost -- compare generating the RPO first and then using that here vs. just using this directly. I can put a note in the implementation as FIXME. WDYT? | |
123 | Agreed. The only reason to expose is for tests. | |
clang/lib/Analysis/IntervalPartition.cpp | ||
55 | That's how the CFG encodes unreachable nodes. succs doesn't actually return CFGBlocks, but an intermediate type which encodes reachability info. When you (implicitly) cast it to CFGBlock*, unreachable nodes become a nullptr. | |
121 | I ended up changing to a deque after almost making the same mistake again. vector is just not safe when you want to be taking addresses. :) I've dropped the use of IDs and added comments. Let me know if this clear enough. |
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
101–103 | Sounds good to me. | |
131 | Would it make sense to have a vector instead and use block ids to get the order? | |
clang/lib/Analysis/IntervalPartition.cpp | ||
121 | Looks good, thanks! I agree that the deque solution looks cleaner, probably we should get it first right and we can optimize later if there is a need for that. |
Addressed comments.
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
101–103 | Added a FIXME in buildIntervals. |
Thanks, this is much simpler!
Just nits & apologies for the delay.
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
24 | some of these added headers are now unused: map, set, variant | |
31 | nit: namespace internal should probably be moved below the public API | |
113 | nit: nullopt | |
113 | what do we expect to do in practice when the CFG is not reducible? or do we expect that to never happen? (mostly wondering if we need a fallback) | |
clang/lib/Analysis/IntervalPartition.cpp | ||
62 | Might be more useful to say concretely: Nodes are either CFGBlock or CFGIntervalNode | |
124 | std::map of pointers is a bit suspicious, densemap? | |
clang/unittests/Analysis/IntervalPartitionTest.cpp | ||
134 | nit: matcher factories are functions and so should be lowerCamelCase |
Thanks for the helpful suggestions!
clang/include/clang/Analysis/Analyses/IntervalPartition.h | ||
---|---|---|
113 | good question -- I'm not really sure what to expect. In practice, any structured function will yield a reducible graph. But, creative use of goto's can undermine that. So, my guess is that we will not encounter these in practice, but I don't know for sure. As for fallback, RPO doesn't rely on reducibility. So, clients are free to use that instead when this fails. | |
clang/lib/Analysis/IntervalPartition.cpp | ||
124 | I went with a vector mapping (implicitly) from Node IDs, like I did for the WTO. |
some of these added headers are now unused: map, set, variant