This essentially builds a more normal call graph as a subgraph of the
"reference graph" that was the old model. This allows both to exist and
the different use cases to use the aspect which addresses their needs.
Specifically, the pass manager and other *ordering* constrained logic
can use the reference graph to achieve conservative order of visit,
while analyses reasoning about attributes and other properties derived
from reachability can reason about the direct call graph.
Note that this isn't yet complete: it doesn't model edges to
declarations or indirect calls yet. Those are planned for subsequent
patches to complete the set of information needed for traditional call
graph based analyses.
An important realization is that the call graph is a formal subset of
the reference graph and thus both can live within the same
data structure. All SCCs of the call graph are necessarily contained
within an SCC of the reference graph, etc.
The design is to build 'RefSCC's to model SCCs of the reference graph,
and then within them more literal SCCs for the call graph.
The formation of call graph SCCs is not done lazily, unlike reference
SCCs. Instead, once a reference SCC is formed, it directly builds the
call SCCs within it and stores them in post-order. This is used to
provide a consistent platform for mutation and update of the graph. The
post-order also allows for very efficient updates in common cases by
bounding the number of nodes (and thus edges) considered.
There is considerable common code that I'm still looking for the best
way to factor out between the various DFS implementations here. So far,
my attempts have made the code harder to read and understand despite
reducing the duplication, which seems a poor tradeoff. I've not given up
on figuring out the right way to do this, but I wanted to wait until
I at least had the system working and tested to continue attempting to
factor it differently.
This also requires introducing several new algorithms in order to handle
all of the incremental update scenarios for the more complex structure
involving two edge colorings. I've tried to comment the algorithms
sufficiently to make it clear how this is expected to work, but they may
still need more extensive documentation.
I have worked through the core algorithms on a whiteboard and I think
their underpinnings are largely correct. However, I still have a lot of
testing to do here. Several of the previous tests have not yet been
updated (and are commented out currently) and new tests have not been
written covering all of the intricasies of the new algorithms. I'm
planning to continue working on the testing, but based on discussions
with Sanjoy, it seemed worthwhile to start the review early. This seems
especially worthwhile considering the very sizable amount of code change
due to introducing both new structures and new algorithms.
I also know that there are some changes which are not strictly
necessarily coupled here. The process of developing this started out
with a very focused set of changes for the new structure of the graph
and algorithms, but subsequent changes to bring the APIs and code into
consistent and understandable patterns also ended up touching on other
aspects. There was no good way to separate these out without causing
*massive* merge conflicts. Ultimately, to a large degree this is
a rewrite of most of the core algorithms in the LCG class and so I don't
think it really matters much.
Anyways, looking forward to comments. I'll also update this as I make
more progress carefully testing the rest of the mutation logic.