This is an archive of the discontinued LLVM Phabricator instance.

[SCCIterator] Check if a SCC is a natural loop.
AbandonedPublic

Authored by baziotis on Feb 23 2020, 11:28 AM.

Details

Summary
  • Get LoopInfo for the function.
  • Given an SCC, take one random block of it. Then, call getLoopFor() with

that block.
If we don't get a loop back, then this SCC is definitely not a loop.

  • Otherwise, we compare the size of the SCC with that of the loop and there

are 3 cases:

  • They're equal: this SCC is a loop.
  • The SCC is smaller: This SCC is not a loop because getLoopFor() gives

the innermost loop. So, if this is smaller, it is an SCC inside a loop.

  • The SCC is bigger: In that case, we can't decide since let's say the

SCC has blocks A, B, C. And the loop has blocks A, B. But blocks A, B, C
might also make a loop. However, since getLoopFor() gives us the
innermost loop, it will give us A, B. So, in that case, use
getParentLoop() repeatedly and repeat the procedure for the parent loops.

Diff Detail

Event Timeline

baziotis created this revision.Feb 23 2020, 11:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2020, 11:28 AM

First of all, do you think this is a useful addition to the SCC interface? We recently needed it in a patch.

Second, I currently don't know how to get LoopInfo. There's also a problem that it is not sure that we'll be able
to get LoopInfo and in that case, what do we return ? (i.e. with this interface there's no way to signify an error).

Third, do you think there's a better way ? I was thinking that maybe there was a
more straightforward way by trying to verify whether the criteria for a natural
loop are satisfied (i.e. there's a block that dominates all the other
blocks).

I'm not sure this fits here, this is very specific to GraphT=Function*, since LoopInfo::getLoopFor() is only defined for (function) basic blocks.

"ADT" stands for "Abstract Data Type". This use case is rather concrete. Unfortunately, LoopInfo ignores non-natural loops, but it would be the place for this.

IIUC, there will be an SCC only for an topmost loop including all its nested loop. Its not possible using SCCIterator to match a non-topmost loops.

I'm not sure this fits here, this is very specific to GraphT=Function*, since LoopInfo::getLoopFor() is only defined for (function) basic blocks.

Yes, that's a problem. Where else could it be placed?
Ideally, I would like to have something that does not depend on LoopInfo, but I don't. :/

baziotis added a comment.EditedFeb 23 2020, 12:06 PM

"ADT" stands for "Abstract Data Type". This use case is rather concrete. Unfortunately, LoopInfo ignores non-natural loops, but it would be the place for this.

So, a function that gets an SCC and decides if it's a loop, alright.
Edit: Meaning, member function of LoopInfo.

IIUC, there will be an SCC only for an topmost loop including all its nested loop. Its not possible using SCCIterator to match a non-topmost loops.

Hmm, that's bizarre. Won't the SCCIterator go through all the SCCs? That is, let's say we have a topmost loop with blocks: A, B, C. And blocks B, C also form a loop.
Won't we get a separate SCC for A, B, C and B, C?

Hmm, that's bizarre. Won't the SCCIterator go through all the SCCs? That is, let's say we have a topmost loop with blocks: A, B, C. And blocks B, C also form a loop.
Won't we get a separate SCC for A, B, C and B, C?

It uses Tarjan's algorithm which only returns maximal connected subgraphs. Anything else would be infeasible. Think of a complete graph, every subset of nodes would by strongly connected (but not maximal), hence return an exponential number of strongly connected subgraphs. Only the entire complete graph is maximal and considered a component.

baziotis added a comment.EditedFeb 23 2020, 12:37 PM

Hmm, that's bizarre. Won't the SCCIterator go through all the SCCs? That is, let's say we have a topmost loop with blocks: A, B, C. And blocks B, C also form a loop.
Won't we get a separate SCC for A, B, C and B, C?

It uses Tarjan's algorithm which only returns maximal connected subgraphs. Anything else would be infeasible. Think of a complete graph, every subset of nodes would by strongly connected (but not maximal), hence return an exponential number of strongly connected subgraphs. Only the entire complete graph is maximal and considered a component.

Oh, that's quite bad. Big mistake on my part. I knew it was Tarjan and actually I had implemented but 3-4 years ago. Somehow, I remembered it gets every sub-graph. I just saw the algorithm again and this
makes both this patch wrong and the other.
Thank you very much Michael!

Edit: Actually, it probably doesn't make this patch wrong, given that I do it in LoopInfo but well...

lebedev.ri requested changes to this revision.Feb 25 2020, 12:17 PM
This revision now requires changes to proceed.Feb 25 2020, 12:17 PM

I think that with the news about SCCIterator, the implementation will be something like that:

bool isSCCNaturalLoop(scc_iterator<Function *> SCCIt) const {
  Loop *L = getLoopFor((*SCCIt).front());
  return L != nullptr;
}

This is basically a hack and I don't know if it makes sense to put it as part of LoopInfo. It doesn't do anything internal to LoopInfo that the user couldn't know.
What do you think?

baziotis abandoned this revision.Mar 3 2020, 2:37 PM

I think that with the news about SCCIterator, the implementation will be something like that:

bool isSCCNaturalLoop(scc_iterator<Function *> SCCIt) const {
  Loop *L = getLoopFor((*SCCIt).front());
  return L != nullptr;
}

This is basically a hack and I don't know if it makes sense to put it as part of LoopInfo. It doesn't do anything internal to LoopInfo that the user couldn't know.
What do you think?

Abandoning this as I personally don't see any value :/