This is an archive of the discontinued LLVM Phabricator instance.

[Docs] Improve cycle and closed path definitions
ClosedPublic

Authored by jsilvanus on Aug 1 2022, 2:48 AM.

Details

Summary

Improve the cycle definition, by avoiding usage of not yet defined
or only vaguely defined terminology inside definitions.
More precisely, the existing definition defined "outermost cycles",
and then proceeded to use the term "cycles" for further definitions,
which in turn were used to actually define "cycles".

Now, instead only define "cycles". This does not change the meaning
of a cycle, which depends on the chosen surrounding (subgraph) of a CFG.

Also mention the function CFG in the first definition, because later
later definitions require it anyways.

Also slightly improve the definition of a closed path, by explicitly
requiring the inner nodes to be distinct.

Diff Detail

Event Timeline

jsilvanus created this revision.Aug 1 2022, 2:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 2:48 AM
jsilvanus requested review of this revision.Aug 1 2022, 2:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 2:48 AM
jsilvanus updated this revision to Diff 448975.Aug 1 2022, 2:51 AM

Improve definition highlighting.

sameerds accepted this revision.Aug 1 2022, 3:38 AM

I agree, this is cleaner than the original. But please do wait for comments from the other reviewers!

This revision is now accepted and ready to land.Aug 1 2022, 3:38 AM
nhaehnle accepted this revision.Aug 2 2022, 4:19 AM

The summary/commit message has a "later later". LGTM apart from that.

This revision was landed with ongoing or failed builds.Aug 3 2022, 1:30 AM
This revision was automatically updated to reflect the committed changes.
foad added a subscriber: foad.Aug 3 2022, 3:31 AM
foad added inline comments.
llvm/docs/CycleTerminology.rst
28

Would it make sense to swap 3 and 4, so you can define the "top-level cycles" without even mentioning DFS?

jsilvanus added inline comments.Aug 4 2022, 1:42 AM
llvm/docs/CycleTerminology.rst
28

You mean one could define "top-level cycles" without mentioning a particular DFS choice?

The current order has the advantage that entries and headers are adjacent, resembling their close relationship.
But I do not have a strong opinion here.

jsilvanus marked an inline comment as not done.Aug 4 2022, 1:42 AM
foad added inline comments.Aug 22 2022, 7:56 AM
llvm/docs/CycleTerminology.rst
28

You mean one could define "top-level cycles" without mentioning a particular DFS choice?

Yes. The sentence "In any depth-first search starting from the entry, the set of cycles found in the CFG is the same" is really strange. Why would you even mention DFS here??? It should be clear from (1) that cycles have nothing to do with any DFS ordering. It's like saying "the set of cycles found in the CFG is the same regardless of whether Denmark and Canada share a land border" :)

I guess the only reason that (4) mentions a DFS is that (3) has already mentioned DFS. That's why I suggested reordering them. But perhaps we could keep the order the same and still remove DFS from (4)?

jsilvanus added inline comments.Aug 23 2022, 5:43 AM
llvm/docs/CycleTerminology.rst
28

Ah, now I understand, and I think you are right.

The current wording probably originates from something like "The set of top-level cycles (i.e. cycles in the original graph) can be found by DFS. In particular, the set of cycles found this way does not depend on the DFS choice."

What about moving 4 to before 2 (to keep 2 and 3 together), and rephrasing to:

The set of cycles in the function CFG itself (and not a nontrivial subgraph of it) are called *top-level cycles*.
The top-level cycles can be found using a DFS from the entry node.

sameerds added inline comments.Aug 24 2022, 9:50 PM
llvm/docs/CycleTerminology.rst
28

We could make it simpler by not focusing so much on "any DFS". What if that part is moved to section titled "Effect of DFS on Cycle Hierarchy"? Under that, we can now say that the top-level cycles are always the same, but the choice of header results in different hierarchies inside the top-level cycles.

This effect on DFS is not necessary to define how cycles are found. So this enumerated definition can just say DFS and "first node encountered is the header". The choice of header is important for its effect on the users of the cycles found. Having it in a separate section with its own anchor will make it very convenient to add refs in other places like the uniformity info document.