This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Transforms/IPO/
-
Transforms/
-
IPO/
16/79
Attributor.cpp
-
test/Transforms/Attributor/
-
Transforms/
-
Attributor/
2
read_write_returned_arguments_scc.ll

Differential D74691

[Attributor] Detect possibly unbounded cycles in functions
ClosedPublic

Authored by omarahmed on Feb 16 2020, 7:22 AM.

Download Raw Diff

Details

Reviewers

jdoerfert
sstefan1
uenoku
baziotis

Commits

rGb285b333dcd0: [Attributor] Detect possibly unbounded cycles in functions

Summary

This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 46859
Build 49513: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

omarahmed added inline comments.Feb 19 2020, 4:58 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2421	the while version was making an infinte loop error , I think getParentLoop when there was no parent anymore it returned the same loop and not null so i fixed it by this first check .

omarahmed marked an inline comment as done.Feb 19 2020, 5:00 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2421	oh i didn't see this continue part so i think now it is the one that makes that , i will edit it to the while version again :)

baziotis added inline comments.Feb 20 2020, 2:44 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2421	:) Btw, the idea to do a do-while and return if there was no initial loop was very good! Also, I wrote the `while` assuming that if there's no parent loop, `getParentLoop()` will return `null`. I now see there's no documentation in `getParentLoop()`, but given that code: https://llvm.org/doxygen/LoopInfo_8h_source.html#l00097 which chases back `ParentLoop`, which is what `getParentLoop()` returns, it should return null when there's no parent.

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

In D74691#1887241, @omarahmed wrote:

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

In D74691#1887414, @baziotis wrote:

In D74691#1887241, @omarahmed wrote:

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

Yes. Please keep the diff up to date so it is easier to follow.

[ Btw. if someone is up for it I have already the next extension in mind. though we can tackle that once this one is done ]

In D74691#1887772, @jdoerfert wrote:

In D74691#1887414, @baziotis wrote:

In D74691#1887241, @omarahmed wrote:

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

Yes. Please keep the diff up to date so it is easier to follow.

[ Btw. if someone is up for it I have already the next extension in mind. though we can tackle that once this one is done ]

I'm up for it. Although I'd like to wait for Omar to finish this one and see if he thinks he can tackle the next (assuming he wants to). :)

improve the docs and add tests

Harbormaster completed remote builds in B47084: Diff 246051.Feb 21 2020, 6:18 PM

improve the docs and add tests

I'm up for it. Although I'd like to wait for Omar to finish this one and see if he thinks he can tackle the next (assuming he wants to). :)

I'm ready for it :)

Harbormaster completed remote builds in B47085: Diff 246052.Feb 21 2020, 6:27 PM

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

llvm/lib/Transforms/IPO/Attributor.cpp
2392–2421	Nit: No braces.
2396	Nit: Move the decleration of `SCCBBs` here (where it is used). Nit: You probably don't need the iterator but just `SCCBBs.front()`.
2400	Nit: Maybe `for (auto &It : make_range(scc_begin(&F), scc_end(&F)))` Or create a helper `scc_range(..)` that create the range.
2412	Style: No `else` after `return`.
2415	Style: go over the comments and make them all sentences, e.g., capital starting letter, dot at the end, dot in between sentences, etc. Nit: Remove `(Loop is Null)`
2422–2423	I have the feeling there is something missing. What if there is an outer SCC with no loop that is as big as the SCC? I think you need a matching loop with max trip count (not even constant btw.) for each SCC. That is, after the do-loop ended, check if `L` is null or not. If it is, we did not find a matching loop. (I'm unsure if that can happen wrt. the block order in SCCs but better save than sorry.)
llvm/test/Transforms/Attributor/willreturn.ll
596 ↗	(On Diff #246052)	I like the tests a lot. Can we add one or two more with irreducible control, hence SCCs that are not (natural) "loops".

In D74691#1887837, @jdoerfert wrote:

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

In D74691#1887970, @baziotis wrote:
In D74691#1887837, @jdoerfert wrote:

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:
An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

In D74691#1887983, @jdoerfert wrote:
In D74691#1887970, @baziotis wrote:
In D74691#1887837, @jdoerfert wrote:

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:
An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?
C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

In D74691#1887995, @baziotis wrote:
In D74691#1887983, @jdoerfert wrote:
In D74691#1887970, @baziotis wrote:
In D74691#1887837, @jdoerfert wrote:

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:
An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?
C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?
That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think, but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

In D74691#1888062, @jdoerfert wrote:
In D74691#1887995, @baziotis wrote:
In D74691#1887983, @jdoerfert wrote:
In D74691#1887970, @baziotis wrote:
In D74691#1887837, @jdoerfert wrote:

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:
An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?
C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?
That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.
So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

omarahmed marked an inline comment as done.Feb 22 2020, 12:20 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2400	I am kind of stuck in using hasLoop function with using these method for looping :) , so how could I change It( vector<Block>) to SCC_iter<Function > type

In D74691#1888074, @baziotis wrote:

In D74691#1888062, @jdoerfert wrote:

In D74691#1887995, @baziotis wrote:

[...]
Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

Assuming forward-progress-guarantees, as we do in LLVM now, you can:

Remove a loop that has no side-effect/sync, regardless of the trip count.
Replace a loop that has no side-effect/sync with an unreachable, if it is proven endless.

The second transformation is "stronger" but also more restrictive. For example,

while(c) {}

cannot be replaced with unreachable but can be removed if you don't know anything about c. You
can replace the loop with llvm.assume(!c) if you want.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

no-forward-progress-guarantee would effectively mean that you cannot remove or replace loops that do not have side-effects/syncs. On way of thinking of this is: (non-)termination becomes observable and threads might be interrupted by events even if there is no direct control flow path.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

I'm not necessarily favoring a default but we need to propose this as an RFC on the list and see where people stand.

llvm/lib/Transforms/IPO/Attributor.cpp
2400	Never mind then ;)

minor updates and added tests

Harbormaster completed remote builds in B47098: Diff 246096.Feb 22 2020, 9:26 PM

I'm OK with this. @baziotis @uenoku please take a look, comment and/or accept it as you see fit.

llvm/test/Transforms/Attributor/willreturn.ll
707 ↗	(On Diff #246096)	New test cases are great, not irreducible but still.

I just saw, the commit subject and message needs to be written though. Explaining what this patch does.

[Attributor]Detect SCCs with unbounded cycles

Updating D74691: [Attributor] add some pattern to containsCycle #
Enter a brief description of the changes included in this update.
The first line is used as subject, next lines as comment. #
If you intended to create a new revision, use:
$ arc diff --create

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

Harbormaster completed remote builds in B47101: Diff 246101.Feb 23 2020, 4:29 AM

In D74691#1888139, @jdoerfert wrote:
In D74691#1888074, @baziotis wrote:

In D74691#1888062, @jdoerfert wrote:

In D74691#1887995, @baziotis wrote:

[...]
Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

Assuming forward-progress-guarantees, as we do in LLVM now, you can:

Remove a loop that has no side-effect/sync, regardless of the trip count.

Replace a loop that has no side-effect/sync with an unreachable, if it is proven endless.

The second transformation is "stronger" but also more restrictive. For example,
while(c) {}
cannot be replaced with unreachable but can be removed if you don't know anything about c. You
can replace the loop with llvm.assume(!c) if you want.

Ok, clearer, thanks!

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

no-forward-progress-guarantee would effectively mean that you cannot remove or replace loops that do not have side-effects/syncs. On way of thinking of this is: (non-)termination becomes observable and threads might be interrupted by events even if there is no direct control flow path.

Yes, it makes sense.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

I'm not necessarily favoring a default but we need to propose this as an RFC on the list and see where people stand.

Me neither, and I'm positive to the RFC of course. If I can help in any way, ping me. I'll definitely take part in the review. :)

llvm/lib/Transforms/IPO/Attributor.cpp
2385	There's https://reviews.llvm.org/D74801. If it gets committed before this, I'll notify you to just change the naming.
2394	To me, that's a little bit inaccurate. It seems like this `if` checks if the SCC is a loop or not. But that's actually the whole code that follows. So, I'd replace it with a comment in the line above (exactly above the `getForLoop()`) since these 2 are coupled. `If any random block in this SCC does not belong to a loop, then this SCC is definitely not a loop.`
2398–2400	One note is that we get the innermost loop with `getLoopFor()` rather than the outermost. Also, the part about the loop cycle was unclear to me. Still, I feel breaking it down into 3 cases is clearer but here's another take: L is the innermost loop that has a common block with the SCC. Since a loop is always an SCC, if their number of blocks are equal, the SCC is a loop - specifically, L. Otherwise, there are 2 cases: - If the SCC has less blocks, then it is definitely not a loop. - If it has more, then we can't decide since the SCC can be a parent loop of L. So, we perform the same test for the parent of L.
2401–2402	The SCC might have a loop inside, but it might not be a loop itself. I propose something like: If L is nullptr, we found no loop that matches exactly the number of blocks of the SCC and so the SCC is not a loop Feel free to revise it but still remain accurate. :)

In D74691#1888202, @omarahmed wrote:

[Attributor]Detect SCCs with unbounded cycles

Updating D74691: [Attributor] add some pattern to containsCycle #

Enter a brief description of the changes included in this update.

The first line is used as subject, next lines as comment. #

If you intended to create a new revision, use:

$ arc diff --create

You don't need this part anywhere.

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

That commit message I think is very good. It's good if you put it in the summary as well.

omarahmed edited the summary of this revision. (Show Details)Feb 23 2020, 11:43 AM

omarahmed retitled this revision from [Attributor] add some pattern to containsCycle to [Attributor] Detect SCCs with unbounded cycles.

[Attributor]Detect SCCs with unbounded cycles

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

Harbormaster completed remote builds in B47103: Diff 246118.Feb 23 2020, 11:54 AM

The patch description is not very useful.
It should actually explain the patch, and why that is the correct solution.

llvm/lib/Transforms/IPO/Attributor.cpp
2369	The old comment is strictly more understandable. New comment does not mention at all that we are checking inside functions only.
2388	s/if/whether/

baziotis added inline comments.Feb 23 2020, 12:00 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2369	Yeah, the addition that we check inside a function would be good. Otherwise, I think the new one is clearer on what it is considered an unbounded cycle in this case.

@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:

As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.

I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

baziotis added a subscriber: Meinersbur.Feb 23 2020, 12:48 PM

In D74691#1888396, @baziotis wrote:
@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:
As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.
I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside

; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }

In D74691#1888442, @omarahmed wrote:
In D74691#1888396, @baziotis wrote:
@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:
As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.
I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.
that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside
; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }

It's good to refer to the LLVM IR, rather than the C/C++ because they can have multiple LLVM IR translations. Just to be sure we're talking about the same thing. :)
In this case, I assume you meant the relevant test case in willreturn.ll. If you put it in a file on its own and run it through the Attributor, with a couple of prints, you'll
see that NoAnalysis is true (btw, I recommend that you try to answer questions by following the code - it's always a learning experience).
I don't know why and that's a good question (on another topic, but still). Interestingly, both LI and SE are null. So, for some reason, for this function, we couldn't
get either of those analyses. I'll try to find some time to check tomorrow.

In D74691#1888455, @baziotis wrote:
In D74691#1888442, @omarahmed wrote:
In D74691#1888396, @baziotis wrote:
@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:
As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.
I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.
that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside
; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }
It's good to refer to the LLVM IR, rather than the C/C++ because they can have multiple LLVM IR translations. Just to be sure we're talking about the same thing. :)
In this case, I assume you meant the relevant test case in willreturn.ll. If you put it in a file on its own and run it through the Attributor, with a couple of prints, you'll
see that NoAnalysis is true (btw, I recommend that you try to answer questions by following the code - it's always a learning experience).
I don't know why and that's a good question (on another topic, but still). Interestingly, both LI and SE are null. So, for some reason, for this function, we couldn't
get either of those analyses. I'll try to find some time to check tomorrow.

that's a great tip,thank you, you are right :)

btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:

for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}

But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:

...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file

For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.

In D74691#1888461, @baziotis wrote:
btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:
for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}
But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:
...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file
For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.

wow, that grapical approach is beautiful , thanks

regarding the old approach i tested it with the if-then-else test i listed before

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  br label %return

if.end:                                           ; preds = %entry
  %0 = load i32, i32* %r0, align 4
  store i32 %0, i32* %w0, align 4
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

and it seems to work fine with it after adding the analysis and maxTripCount part but it fails in another tests , I will work on them tmw and submit another diff

In D74691#1888491, @omarahmed wrote:
In D74691#1888461, @baziotis wrote:
btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:
for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}
But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:
...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file
For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.
wow, that grapical approach is beautiful , thanks

No problem :)

regarding the old approach i tested it with the if-then-else test i listed before
define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  br label %return

if.end:                                           ; preds = %entry
  %0 = load i32, i32* %r0, align 4
  store i32 %0, i32* %w0, align 4
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}
and it seems to work fine with it after adding the analysis and maxTripCount part but it fails in another tests , I will work on them tmw and submit another diff

Ok, good. I'll check the method again tomorrow.

[Attributor]Detect functions with unbounded loops

This patch changes the approach of containsCycle function of looping on
the CFG from dfs to bfs as the dfs approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so bfs solves that
problem by only moving level by level on the CFG and only detects loops when
there is an edge to a BB in a visited level.
This patch also uses maxTripCount to detect unbounded loops in function.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

omarahmed retitled this revision from [Attributor] Detect SCCs with unbounded cycles to [Attributor]Detect functions with unbounded loops.Feb 24 2020, 4:07 PM

omarahmed edited the summary of this revision. (Show Details)

Harbormaster completed remote builds in B47167: Diff 246332.Feb 24 2020, 4:09 PM

lebedev.ri added inline comments.Feb 25 2020, 12:02 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2371–2372	I'm a little bit lost here, if a loop is not known to have `SmallConstantMaxTripCount`, why does that mean it's unbounded?

In D74691#1890385, @omarahmed wrote:

[Attributor]Detect functions with unbounded loops

Please change that to "[Attributor] Detect possibly unbounded cycles in functions"

This patch changes the approach of containsCycle function of looping on
the CFG from dfs to bfs as the dfs approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so bfs solves that
problem by only moving level by level on the CFG and only detects loops when
there is an edge to a BB in a visited level.
This patch also uses maxTripCount to detect unbounded loops in function.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

This message is good, but we will make it better. For now, certainly the old method
is wrong, I just saw it, but also this method is wrong.
First of all, the old method is wrong because of the example you posted. Generally,
that's not a good way to test cycles because the idea is to do a sort of O(n^2), which means
marking nodes visited but then marking them unvisited. You have to do in a kind of recursive
algorithm where as you forward (and deeper in the recursion) you mark them and as you come
back (returning from the recursion) you unmark them. I'll come back to this in a bit.

Regarding your method, consider this case:

define i32* @test2(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %l2, label %l1

l1:
  br label %l2

l2:
  br label %return

return:
  ret i32* %w0
}

Visually (and you can see this with view-cfg) it looks like that: https://imgur.com/a/n8iG7Ig (using my awful drawing skills)
And we construct the branch to first visit the l2 (by putting in the true label of br) so that when l1 does BFS.
It detects it as a cycle but it's not.
Certainly, BFS is not a popular way to do cycle-detection and I'd abstain from that as all the algorithmic literature
that I know of uses some sort of DFS for this purpose.
So, for now, and we may be able to make it better, you can use something like this:

bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

This function works as a utility and you initialize the 2 sets in the main function.
I wrote this quickly but I think it works. Note the usage of the 2 sets:

Processed helps us not do the same work for the same starting BB. It does not help us identify cycles.
This is done by the Visited set. The trick here is that when returning from the recursion, we mark the nodes unvisited, so we can start search in a different tree.

Edit: I'll come back with how we can make maxTripCount fit into this.

llvm/lib/Transforms/IPO/Attributor.cpp
2369–2371	Here, you want something clearer to what the function does. I'd say, you can go with: "Helper function that checks whether a function has any cycle which we don't know is bounded. Loops with maximum trip count are considered bounded, any other cycle not." We may have to revise this further but it seems a good start.
2371–2372	It doesn't. In a previous review the name `mayContainUnboundedCycle` was proposed IIRC, which I prefer.

omarahmed retitled this revision from [Attributor]Detect functions with unbounded loops to [Attributor] Detect possibly unbounded cycles in functions.Feb 25 2020, 8:13 AM

Edit: I'll come back with how we can make maxTripCount fit into this.

So, I'm back. Before continuing, please tell me if everything in this algorithm was clear.
If there are any questions, please ask. So, this algorithm should be strictly better
than the original but still we haven't used LoopInfo and SCEV.

Here's another quick version that I think does that. Btw, please don't code anything yet
because especially after my last mistake I want to be extra sure that what I propose
is totally correct. Here we go:

Ok, in the version I posted above, in the else if, where we check if the successor is
already visited, if it is true, we just found a cycle. But, how do we check
whether this cycle is a loop or not? One way is to getLoopFor() for a node of
the cycle and if we do get a Loop back, check if the size of the loop with the size
of the cycle is the same. If yes, it is a loop, specifically the one we got. This is a little
subtle, so to understand this, think that a loop is always a SCC (same for the cycle). Now,
2 SCCs are either completely disjoint (no common node) or they make one big SCC.
Since the loop and the cycle do have one common node (the one we used in getLoopFor()),
they are in the same SCC. So, if the cycle and the loop have the same number of nodes,
then they're the same thing.
But, how do we get the number of nodes?
Well, in the else if, where we find a cycle, one start of the cycle is
SuccBB. The thing is, going up the recursion, we have already seen this
node (that's why it's visited). So, going down the recursion (i.e. returning)
we're going to see it again. If we save what is this starting node and count
how many nodes we go down until we find it again, this is the size of the cycle.
The code looks something like this:

static SmallPtrSet<BasicBlock *, 32> *Visited;
static SmallPtrSet<BasicBlock *, 32> *Processed;
static BasicBlock *StartBB;
static unsigned BBCount;
bool containsCycleUtil(BasicBlock *BB) {
  if (Processed->count(BB)) {
    Visited->erase(BB);
    return false;
  }
  Processed->insert(BB);
  Visited->insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed->count(SuccBB) && containsCycleUtil(SuccBB)) {
      ++BBCount;
      // Continue going down, we haven't found
      // the start yet.
      if (StartBB != BB) {
        return true;
      }
    } else if (Visited->count(SuccBB)) {
      // Found cycle, save the start.
      StartBB = SuccBB;
      ++BBCount;
      return true;
    }
  }
  Visited->erase(BB);
  return false;
}

(I just moved the invariants to globals so that they're not passed
back and forth in the recursion).

What remains is to utilize the LoopInfo. I have
an algorithm but before doing that we have to solve an important problem. I don't seem to be able
to get LoopInfo, it's always null. I found that I probably
need to add -loops -analyze in the command-line but still
it doesn't work. @jdoerfert any idea?

bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

what i get is that we will implement dfs by our hand to detect loops so can't we use stack method in dfs that will provide us to not split the function into 2 and keep all the logic inside contains cycle

In D74691#1891862, @omarahmed wrote:

bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

what i get is that we will implement dfs by our hand

Yes

to detect SCCs

Cycles, it's a little bit different.

so can't we use stack method in dfs that will provide us to not split the function into 2 and keep all the logic inside contains cycle

Yes you can. But note that the runtime stack (which is implemented in the hardware) is quite faster than a software stack.
OTOH, a software stack is not as limited to its depth as a software stack. But, given that we pass a single pointer, it should
be able to handle quite a big number of blocks.
It's an implementation detail and not the most important thing. That would be the algorithm. I'd prefer the recursion but we can sure ask @jdoerfert on this.

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

In D74691#1892447, @omarahmed wrote:

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

In D74691#1892476, @baziotis wrote:

In D74691#1892447, @omarahmed wrote:

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

I mean the code you suggested that loops on SCCs using SCCiter then it appeared that tarjan only detects maximal SCCs, the one that uses tarjan algorithm :)

In D74691#1892447, @omarahmed wrote:

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Can u share the IR (with the appropriate RUN lines, thus tests) not the images please?

In D74691#1892538, @omarahmed wrote:

In D74691#1892476, @baziotis wrote:

In D74691#1892447, @omarahmed wrote:

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

I mean the code you suggested that loops on SCCs using SCCiter then it appeared that tarjan only detects maximal SCCs, the one that uses tarjan algorithm :)

With that code the reason you don't find any max trip count as I said is that you probably don't have LoopInfo. Running the examples,
I could not get LoopInfo in any way, even with -loops -analyze. That's a problem that we have to fix so that the new algorithm can be improved
with LoopInfo as well.

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

In D74691#1892769, @jdoerfert wrote:

In D74691#1892447, @omarahmed wrote:

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Can u share the IR (with the appropriate RUN lines, thus tests) not the images please?

okay , sorry for this anonymous comment

the first IR :

define i32* @test1(i32* %n0, i32* %r0, i32* %w0) {
entry:
	%tmp = add i32 0, 0
	br label %condition

condition:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %body, label %return

body:
	%tobool3 = icmp ne i32* %n0, null
    br i1 %tobool3, label %l4, label %condition	

l4:
	br label %l4

return:                                           
  ret i32* %w0
}

the second IR :

define i32* @test2(i32* %n0, i32* %r0, i32* %w0) {
entry:
	br label %l1
l1:
	br label %l2
l2:
	br label %l3
l3:
	%tobool = icmp ne i32* %n0, null
    br i1 %tobool, label %return, label %l4
l4:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %l6, label %l5
l5:
	br label %l1
l6:
	br label %l7
l7:
	br label %l8
l8:
	br label %l6

return:                                           
  ret i32* %w0
}

the third IR :

define i32* @test3(i32* %n0, i32* %r0, i32* %w0) {
entry:
	br label %l1
l1:
	br label %l2
l2:
	br label %l3
l3:
	%tobool = icmp ne i32* %n0, null
    br i1 %tobool, label %return, label %l4
l4:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %l6, label %l5
l5:
	br label %l1
l6:
	br label %l7
l7:
	br label %l4

return:                                           
  ret i32* %w0
}

the run command :

./bin/opt -passes=attributor --attributor-disable=false /home/omar/mytest.ll -S

the code I tested it on :

static bool containsUnboundedCycle(Function &F, Attributor &A) {
  bool NoAnalysis = false;

  ScalarEvolution *SE =
      A.getInfoCache().getAnalysisResultForFunction<ScalarEvolutionAnalysis>(F);
  LoopInfo *LI = A.getInfoCache().getAnalysisResultForFunction<LoopAnalysis>(F);
  if (!LI || !SE){
    NoAnalysis = true;
    dbgs() << "no analysis in function\n";
  }

  for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
       ++It) {
    dbgs()<<"SCC start\n\n\n";
    for(auto SCCBB : *It){
      dbgs()<< *SCCBB <<"\n";
    }
    bool SCCHasLoop = It.hasLoop();
    if (!SCCHasLoop){
      dbgs()<<"SCC does not have loop\n";
      continue;
    }
    // When NoAnalysis available then check only if the SCC has loop or not.
    if (NoAnalysis && SCCHasLoop){
      dbgs() << "no analysis and does have loop\n";
      return true;
    }

    const std::vector<BasicBlock *> &SCCBBs = *It;
    // If any random block in this SCC does not belong to a loop, then this SCC
    // is definitely not a loop.
    Loop *L = LI->getLoopFor(SCCBBs.front());
    if (!L){
      dbgs() << "loop is null so current SCC does not have loop\n";
      return true;
    }

    // L is the innermost loop that has a common block with the SCC. Since a
    // loop is always an SCC, if their number of blocks are equal, the SCC is a
    // loop so we check if it is bounded or Unbounded loop by checking the
    // maxTripCount. Otherwise, there are 2 cases:
    // - If the SCC has less blocks, then it is definitely not a loop.
    // - If it has more, then we can't decide since the SCC can be a parent loop
    //   of L. So, we perform the same test for the parent of L.
    do {
      if (L->getNumBlocks() > SCCBBs.size()){
        dbgs() << "SCC is less so not loop\n";
        return true;
      }

      if (L->getNumBlocks() == SCCBBs.size())
        if (!SE->getSmallConstantMaxTripCount(L)){
          dbgs() << "there is no max trip count for the current SCC\n";
          return true;
        }
        dbgs() << "loop with max trip count\n";
        break;

    } while ((L = L->getParentLoop()));

    // Check if L is null, we found no loop that matches exactly the number of
    // blocks of the SCC and so the SCC is not a loop.
     if (!L)
      return true;
  }
  return false;
}

and in the three tests it gets me "there is no max trip count for the current SCC\n" and return true
so I what I was asking is that I can't figure out what is the test that would fail in this code ?
more explanation :
I can't figure out this test to try it and see why it is failing in it :)

That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC.

In D74691#1893012, @uenoku wrote:

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

Thank you mate, I didn't know I was using the old pass manager.

@omarahmed I would advise you to not work anymore on the old algorithm as it's certainly not effective. I'm sorry, but
I barely have time to work on the new algorithm :/ Otherwise, I would try to answer it.

So, here's a new algorithm with a couple of debug prints:

static DenseMap<BasicBlock *, unsigned> VisitingTime;
bool containsCycleUtil(BasicBlock *BB) {
  dbgs() << "Visiting: " << BB->getName() << "\n";
  unsigned VisitingTimeCurrent = VisitingTime[BB];
  for (auto *SuccBB : successors(BB)) {
    dbgs() << BB->getName() << " -> " << SuccBB->getName() << "\n\n";
    unsigned &VisitingTimeSucc = VisitingTime[SuccBB];
    if (!VisitingTimeSucc) {
      VisitingTimeSucc = VisitingTimeCurrent + 1;
      containsCycleUtil(SuccBB);
      VisitingTimeSucc = 0;
    } else {
      dbgs() << "Found cycle: " << VisitingTimeCurrent << ", " << VisitingTimeSucc << "\n";
      assert(VisitingTimeCurrent >= VisitingTimeSucc);
      unsigned CycleSize = VisitingTimeCurrent - VisitingTimeSucc + 1;
      dbgs() << "Cycle size: " << CycleSize << "\n";
    }
  }
  return false;
}

static bool containsCycle(Attributor &A, Function &F) {
  BasicBlock *EntryBB = &F.getEntryBlock();
  VisitingTime[EntryBB] = 1;
  return containsCycleUtil(EntryBB);
}

The main modification compared to the last one is that I track visiting time
instead of going back the recursion. This makes the algorithm so much
cleaner (and reliable, for reasons that are not on topic).
Now, this algorithm, AFAICT, finds all cycles, without false positives. Plus, it detects
only statically reachable cycles. That's good news, it should be strictly better than the initial.

Naively, one could then say "let's apply the same idea as with the SCCs algorithm.
This does not fully work though because check this:

define void @loop_with_if(i1 %c1, i1 %c2) {
entry:
  br label %w1

w1:
  br i1 %c1, label %b1, label %exit

b1:
  br label %if

if:
  br i1 %c2, label %t1, label %e1

t1:
  br label %latch

e1:
  br label %latch

latch:
  br label %w1

exit:
  ret void
}

If you pay attention, it's a while with an if inside. If you draw out the CFG, you can
see that there's a cycle: w1 -> b1 -> if -> e1 -> latch -> w1
which the algorithm will (correctly) find. The cycle size is 5 but the loop size is 6.
So, in such loops, even if SCEV can deduce a max trip count (which is possible),
the initial method will fail to realize this is one loop.

Which makes a point: There can be a cycle, which is not a loop, which still does not
disrupt the maximum trip count guarantee (interestingly, that means that even if
SCCIterator gave us all the SCCs, we would stumble upon this problem).

Edit: Well, that was obvious before but not quite in the same context.

And there can be many many such real-world cases.
From this point on, the problem I think starts to get quite complicated. It seems we
need to start saying "if all the blocks of the cycle belong to the same loop and
these blocks contain the header and the latch of the loop blah.. blah.." then
this cycle can be skipped.

IMHO, we should maybe just live with a not-so-perfect-algorithm, or decide
that it is going to take a good amount of time to get it right.

And sorry for the long posts, but this problem turned out to be more complicated
that it initially seemed. I hope I helped a bit.

In D74691#1894286, @baziotis wrote:
In D74691#1893012, @uenoku wrote:

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

Thank you mate, I didn't know I was using the old pass manager.

@omarahmed I would advise you to not work anymore on the old algorithm as it's certainly not effective. I'm sorry, but
I barely have time to work on the new algorithm :/ Otherwise, I would try to answer it.

So, here's a new algorithm with a couple of debug prints:
static DenseMap<BasicBlock *, unsigned> VisitingTime;
bool containsCycleUtil(BasicBlock *BB) {
  dbgs() << "Visiting: " << BB->getName() << "\n";
  unsigned VisitingTimeCurrent = VisitingTime[BB];
  for (auto *SuccBB : successors(BB)) {
    dbgs() << BB->getName() << " -> " << SuccBB->getName() << "\n\n";
    unsigned &VisitingTimeSucc = VisitingTime[SuccBB];
    if (!VisitingTimeSucc) {
      VisitingTimeSucc = VisitingTimeCurrent + 1;
      containsCycleUtil(SuccBB);
      VisitingTimeSucc = 0;
    } else {
      dbgs() << "Found cycle: " << VisitingTimeCurrent << ", " << VisitingTimeSucc << "\n";
      assert(VisitingTimeCurrent >= VisitingTimeSucc);
      unsigned CycleSize = VisitingTimeCurrent - VisitingTimeSucc + 1;
      dbgs() << "Cycle size: " << CycleSize << "\n";
    }
  }
  return false;
}

static bool containsCycle(Attributor &A, Function &F) {
  BasicBlock *EntryBB = &F.getEntryBlock();
  VisitingTime[EntryBB] = 1;
  return containsCycleUtil(EntryBB);
}
The main modification compared to the last one is that I track visiting time
instead of going back the recursion. This makes the algorithm so much
cleaner (and reliable, for reasons that are not on topic).
Now, this algorithm, AFAICT, finds all cycles, without false positives. Plus, it detects
only statically reachable cycles. That's good news, it should be strictly better than the initial.

Naively, one could then say "let's apply the same idea as with the SCCs algorithm.
This does not fully work though because check this:
define void @loop_with_if(i1 %c1, i1 %c2) {
entry:
  br label %w1

w1:
  br i1 %c1, label %b1, label %exit

b1:
  br label %if

if:
  br i1 %c2, label %t1, label %e1

t1:
  br label %latch

e1:
  br label %latch

latch:
  br label %w1

exit:
  ret void
}
If you pay attention, it's a while with an if inside. If you draw out the CFG, you can
see that there's a cycle: w1 -> b1 -> if -> e1 -> latch -> w1
which the algorithm will (correctly) find. The cycle size is 5 but the loop size is 6.
So, in such loops, even if SCEV can deduce a max trip count (which is possible),
the initial method will fail to realize this is one loop.

Which makes a point: There can be a cycle, which is not a loop, which still does not
disrupt the maximum trip count guarantee (interestingly, that means that even if
SCCIterator gave us all the SCCs, we would stumble upon this problem).

Edit: Well, that was obvious before but not quite in the same context.

And there can be many many such real-world cases.
From this point on, the problem I think starts to get quite complicated. It seems we
need to start saying "if all the blocks of the cycle belong to the same loop and
these blocks contain the header and the latch of the loop blah.. blah.." then
this cycle can be skipped.

IMHO, we should maybe just live with a not-so-perfect-algorithm, or decide
that it is going to take a good amount of time to get it right.

And sorry for the long posts, but this problem turned out to be more complicated
that it initially seemed. I hope I helped a bit.

I now understand the problem clearly , Thank you really , u helped me a lot :)
I will try to submit a diff tmw with this new algorithm with adding a doc :)

I now understand the problem clearly , Thank you really , u helped me a lot :)

No problem! If you have any questions, please ask.

I will try to submit a diff tmw with this new algorithm with adding a doc :)

You may want to submit it initially without the LoopInfo.

I tried to use loopInfo on the algorithm , I have came with 2 approaches :
the first is that if we found a cycle we add the BB in the cycle in a allCyclesBBs vector and after we return from containCycleUtil with all the BBs that makes cycles then loop on the allCyclesBBs vector in the containsCycle function and get the information of the loop with getloopfor as we was doing or ask this inside the containCycleUtil and make this function return us if there was a cycle with no maxTripCount
the second is that we do this dfs in a stack style and if we find a cycle we ask for loop info and do the same checks as before like maxtripCount
I don't know which of them is better ?

also the algorithm sometimes in cases moves on BBs 2 times like this case here :

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %tobool2 = icmp ne i32* %n0, null
  br i1 %tobool2, label %return, label %if.end

if.end:                                           ; preds = %entry
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

so I think if we keep not only the visiting time but the start time = visitingTime and finish time of each node we have visited then we can detect a cycle if we reach a node with no finishTime that means that this node we have visited but not returned from it to make it finish , so this makes the complexity in all the situations O( N+E )

another question Is it normal to make global variables as I have not found any in the code ?

In D74691#1898708, @omarahmed wrote:

I tried to use loopInfo on the algorithm , I have came with 2 approaches :
the first is that if we found a cycle we add the BB in the cycle in a allCyclesBBs vector and after we return from containCycleUtil with all the BBs that makes cycles then loop on the allCyclesBBs vector in the containsCycle function and get the information of the loop with getloopfor as we was doing or ask this inside the containCycleUtil and make this function return us if there was a cycle with no maxTripCount
the second is that we do this dfs in a stack style and if we find a cycle we ask for loop info and do the same checks as before like maxtripCount
I don't know which of them is better ?

The second one unless I miss something. Oh, I just saw I didn't post the code with LoopInfo. But in any case,
you don't need to save the cycles for any reason cycles.

also the algorithm sometimes in cases moves on BBs 2 times

You probably mean the return label. Yes it does, because you try to find all cycles. Compare this to the initial algorithm in which you find if there's any cycle at all (check the Processed there that is missing here).

like this case here :
define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %tobool2 = icmp ne i32* %n0, null
  br i1 %tobool2, label %return, label %if.end

if.end:                                           ; preds = %entry
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}
so I think if we keep not only the visiting time but the start time = visitingTime and finish time of each node we have visited then we can detect a cycle if we reach a node with no finishTime that means that this node we have visited but not returned from it to make it finish , so this makes the complexity in all the situations O( N+E )

another question Is it normal to make global variables as I have not found any in the code ?

Generally, global variables are bad. However, when doing recursion, you don't want to pass parameters that are invariant because the fill the stack without a reason
and limit the recursion depth. So, you have 2 choices. Either you make global variables or you make a class that does the thing you want to do.
Then, the global variables get replaced by member variables. To me, this is mostly a matter of preference. Indeed, there are no global variables in the Attributor generally
but in this case, I don't know what would be preferred. This is the easiest thing to replace, so you can use them for now if you want.
In any case, this patch got an unexpected turn so I'd wait for @jdoerfert to tell us an opinion in general. Give him a little a bit of time since:
a) There's a lot to read.
b) He's travelling.

I haven't read all the comments but the code looks good with one problem, see below.

llvm/lib/Transforms/IPO/Attributor.cpp
2378	Nit: `bool NoAnalysis = !SE \|\| !LI;`
2396	if (!L \|\| !SE->getSmallConstantMaxTripCount(L)) return true; You ignore non-loop cycles otherwise. Please add a non-loop cycle test case, aka. irreducible control flow test case. Here is a piece of C that should help create the test: if (foo()) goto entry1; else goto entry2; entry1: if (foo()) goto exit; else goto entry2; entry2: if (foo()) goto exit; else goto entry1; exit: return;

baziotis added inline comments.Mar 3 2020, 2:07 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2396	Please note that this code (i.e. the original) is wrong anyway, as for example in this it finds a cycle because we don't mark nodes unvisited when trying for a new cycle: define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) { entry: %tobool = icmp ne i32* %n0, null br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry %0 = load i32, i32* %r0, align 4 store i32 %0, i32* %w0, align 4 br label %return return: ; preds = %if.end, %if.then ret i32* %w0 } As I noted in earlier comments, to find all cycles correctly, one has to visit but then unvisit nodes when trying a new cycle. This eventually led to new algorithms which did full cycle detection and I ended up with this algorithm with the caveats mentioned: https://reviews.llvm.org/D74691/new/#1894286

Please add a non-loop cycle test case, aka. irreducible control flow test case.

okay I will add it in the next diff :)

jdoerfert added inline comments.Mar 4 2020, 1:49 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2396	Not wrong but too conservative ;) That's a big difference. However, I agree, we should be as precise as is reasonably cheap.

baziotis added inline comments.Mar 4 2020, 3:08 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2396	If you mean that the actual problem is the speed which is a side-effect of conservation, then yes, definitely it's unreasonably slow (given a complete graph, it has sth like exponential time). Other than that, the fact that a block of a cycle is inside a loop does not mean that the cycle is this loop. Maybe we can use it if the block is special and it has this specific property. For example, if we knew that the header/latch/... of a loop can't be part of a non-loop cycle. I don't know if something like that holds, I'll try to check tomorrow.

omarahmed updated this revision to Diff 248394.Mar 4 2020, 10:22 PM

[Attributor]Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so the new approach detects cycles by using a recursive function that dfs on the CFG from the entry block and detects a cycle when we go in a recursive calls chain and visit one of the nodes of the chain.
This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.
This patch also uses maxTripCount to detect if a function has a cycle,then we check
further whether the cycle is a bounded loop cycle.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

[Attributor]Detect possibly unbounded cycles in functions

Harbormaster completed remote builds in B48152: Diff 248394.Mar 4 2020, 11:31 PM

Harbormaster completed remote builds in B48153: Diff 248395.Mar 5 2020, 12:03 AM

In D74691#1906930, @omarahmed wrote:

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

llvm/lib/Transforms/IPO/Attributor.cpp
2393	Note that as I mentioned, if a node of a cycle is within a loop, it doesn't mean it is the loop.

In D74691#1907179, @baziotis wrote:

In D74691#1906930, @omarahmed wrote:

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

From what i understaand is that when we have a case like that https://i.imgur.com/cvAjcMa.png and when cycle 1-2-3 have amax trip count and we visit them first we have a problem
but i think it won't make a problem since 2 and 3 which are in the bigger cycle must have edges in the cycle in here is the edge of 3 to 4 so 3 and 2 will not be marked processed until they completly finish the 3 node successors so it will complete in the bigger cycle and detect the cycle . I think marking processed at the end will make us escape from a problem like that

In D74691#1907574, @omarahmed wrote:

In D74691#1907179, @baziotis wrote:

In D74691#1906930, @omarahmed wrote:

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

From what i understaand is that when we have a case like that https://i.imgur.com/cvAjcMa.png and when cycle 1-2-3 have amax trip count and we visit them first we have a problem
but i think it won't make a problem since 2 and 3 which are in the bigger cycle must have edges in the cycle in here is the edge of 3 to 4 so 3 and 2 will not be marked processed until they completly finish the 3 node successors so it will complete in the bigger cycle and detect the cycle . I think marking processed at the end will make us escape from a problem like that

In such scenarios you won't have a problem indeed. What I meant is such scenarios: https://imgur.com/a/Ua8xwhg
Consider the case where you find the cycle 0-1-2-0 but it is a loop with max trip count. Upon returning (from recursion) in 1, you will visit 3. But from 3, you won't visit 1 because it's already processed.
If you're doing cycle detection (i.e. check if a graph has a cycle whatsoever), this is correct because there's no point revisiting processed nodes. But this is not our case because we skip.

So, to sum up: As I mentioned, LoopInfo I think can't be used reliably as it is used now for the points mentioned previously. And we have 2 cases:

If we find a reliable way to skip cycles, in this algorithm we will have false negatives which I think is bad. Maybe better the initial version which while it had false positives, it didn't have false negatives.
If not, this is probably as good as we can get in cycle detection and has a good complexity.

High level comments:

Please do not use global variables. If you need a set/map, create it on the stack and pass it as a reference.
If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)

In D74691#1908641, @jdoerfert wrote:

If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)

That's a great idea! @omarahmed note that LoopInfo gives you the top-level loops (https://llvm.org/docs/LoopTerminology.html#id3). One way to get top-level loops is getLoopsInPreorder(). Then, you can get all the sub-loops of a loop using getSubLoops(). However, note that this will give you the immediate children if I'm not mistaken. That is, say you have a top-level loop A that contains B that contains C. A->getSubLoops() will give you B. You then have to do B->getSubLoops() to get C.
Before you do all that, you check for irreducible control of course.

Other than that, and you may skip that, I think it's interesting to see what happens behind the scenes. mayContainIrreducibleControl() calls containsIrreducibleCFG().
This requires LoopInfo. Apart from its implementation that is interesting, it's also educational to read about how do you test for irreducible control in a CFG (i.e. cycle that is not a Loop in the LLVM terms) without LoopInfo.
Here's the diff in which it was added: https://reviews.llvm.org/D40874#1013599 . Check online for T1 / T2 transformations in control-flow graphs (I just did 'cause of course I didn't know any of that and it's quite fun).

In D74691#1908762, @baziotis wrote:

In D74691#1908641, @jdoerfert wrote:

If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)

That's a great idea! @omarahmed note that LoopInfo gives you the top-level loops (https://llvm.org/docs/LoopTerminology.html#id3). One way to get top-level loops is getLoopsInPreorder(). Then, you can get all the sub-loops of a loop using getSubLoops(). However, note that this will give you the immediate children if I'm not mistaken. That is, say you have a top-level loop A that contains B that contains C. A->getSubLoops() will give you B. You then have to do B->getSubLoops() to get C.
Before you do all that, you check for irreducible control of course.

Other than that, and you may skip that, I think it's interesting to see what happens behind the scenes. mayContainIrreducibleControl() calls containsIrreducibleCFG().
This requires LoopInfo. Apart from its implementation that is interesting, it's also educational to read about how do you test for irreducible control in a CFG (i.e. cycle that is not a Loop in the LLVM terms) without LoopInfo.
Here's the diff in which it was added: https://reviews.llvm.org/D40874#1013599 . Check online for T1 / T2 transformations in control-flow graphs (I just did 'cause of course I didn't know any of that and it's quite fun).

okay will work on it right away and see how this goes :) , Thanks for pointing me to this great resources will definitely check all of them :D

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.
If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.
This patch also uses maxTripCount to check if a function have bounded loop cycles.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.
If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.
This patch also uses maxTripCount to check if a function have bounded loop cycles.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

Harbormaster failed remote builds in B48475: Diff 248960!Mar 7 2020, 1:33 PM

In D74691#1911429, @omarahmed wrote:

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.

You don't need anything above here. :)

If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.

It would be great to fix grammar / syntax a little bit. You can break sentences to make this easier. Also, you can be a little more specific (when it is useful). For example:
If LoopInfo and SCEV analyses are available, then:

We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.
Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Note that the first step can be done more cheaply without requiring LoopInfo. But we use it since it is implicitly needed in the second step.

It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

This is good, you can leave that.

Aside from the description, we have to do something when analyses are not available. My proposal is to use the old method, again because it doesn't have false negatives and it's reasonably cheap.
Lastly, I want to reiterate that if there's anything that is unclear, please ask.

llvm/lib/Transforms/IPO/Attributor.cpp
2376–2382	You don't need any of that. This is what was mentioned in a previous diff: Specifically that SCC iterator does not loop through all SCCs, only the top-level one, so you can't do that.
2395–2401	As @jdoerfert mentioned, you can replace this with mayContainIrreducibleControl
2405–2406	As I said in a previous comment, `getLoopsInPreorder()` (or any way that `LoopInfo` gives you access to all the loops) gives you the top-level loops. The fact that a top-level loop has a maximal trip count does not mean that its children loops (i.e. loops entirely contained within this loop) will also have a maximal trip count. So, for every top-level loop, you have to check all its children (i.e. `getSubLoops()`). And all that, in a loop / recursion, because the sub loops may themselves have children loops etc.

Harbormaster failed remote builds in B48478: Diff 248963!Mar 7 2020, 2:05 PM

omarahmed marked 3 inline comments as done.Mar 7 2020, 5:28 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2376–2382	I tried to do the old method but with no false positives, from what I understand here we wanted only to detect if a cycle exists only and doesn't care if it is a loop or not so I thought that SCCs evenwhen they will give us the top level ones these top level ones will mean that there is a cycle and if there were not these top level SCC structure and it is only consists of single nodes SCCs it will make us not detect a cycle so i used hasCycle function to detect that , does a top level SCC can exist that is not a cycle and a smaller SCC that is a cycle exist in it ?
2395–2401	I found that the function is in mustExcute.cpp only and not available in the header file mustExcute.h so how could i use it in this case i guess it should be added to the header :)
2405–2406	I have tried it with a test like that for(int i=0;i<n;i++) for(int j =0;j<n;j++) for(int k = 0;k<n;k++) for(int l = 0;l<n;l++) and printed the loops that it iterate on and i found it had iterated even on the forth level one , also I have looked on the implementation of the cycle and found that it recursivly was going inside the loops so i guess it goes throw all of the loops not only the top-level ones or I miss something here ?

omarahmed marked an inline comment as done.Mar 7 2020, 5:52 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2405–2406	I mean the implementation of the function not cycle it is a typo :)

baziotis added inline comments.Mar 8 2020, 1:37 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2376–2382	Sorry, I wrote these comments quite quickly yesterday. You're right. To answer, at first, no, any SCC unless it is single-node without self-loop. Now, `scc_iterator` uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones. Also, note that in this case, you could use the algorithm version with `Processed`. This checked whether there is a cycle in the graph in the same complexity as Tarjan, but it's also simpler. Use what you like. However, if you use the `scc_iterator` version, we need a good explanation (you can pick stuff from this comment).
2395–2401	Yep, true, it's `static`. Please add it to the header. :)
2405–2406	Yes, also true, it gives you all the loops in preorder, I just checked the function. So, yes that should work.

bbn added a subscriber: bbn.Mar 8 2020, 8:39 AM

omarahmed marked an inline comment as done.Mar 8 2020, 1:06 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2395–2401	i tried to erase static as static means that it will only used in the cpp file also i have put the definition in the header file but that resulted in an error in linking so how could we do that here ? I guess also that all the functions i saw was only static in the cpp file or they are belong to a class in the .h or cpp file but no function that is implemented alone so I think that getting this 3 lines is more clean or what do you think ?

Patch updates :

Check If LoopInfo and SCEV analyses are available, then:

We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.
Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Else if unavailable:

We check if any cycle exists then we return true or else we return false.we use scc_iterator to check for cycles.

Note:

scc_iterator uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones.

Harbormaster failed remote builds in B48509: Diff 249021!Mar 8 2020, 5:03 PM

why it is failing in harbor master remote builds,I am sure it compiles , passes tests and clang format run on it?
also regarding the tests it gives expected fails =1 in liveness.ll , from what i understand it is expected to fail anyway but is that a problem that needs to be fixed ?

In D74691#1911863, @omarahmed wrote:

Patch updates :

Check If LoopInfo and SCEV analyses are available, then:

We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.

Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Else if unavailable:

We check if any cycle exists then we return true or else we return false.we use scc_iterator to check for cycles.

Note:

scc_iterator uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones.

That's quite better I think! :) Side note: You can rephrase or change my proposals regarding documentation, commit messages etc. I'm here only to try to make sure that you get the point across (you're also
free to copy exactly what I say of course, pick whatever you like).

why it is failing in harbor master remote builds,I am sure it compiles , passes tests and clang format run on it?

On that part unfortunately I can't help. I'm not familiar with harbomaster. AFAIK, it shouldn't fail. Maybe Johannes or @uenoku can help.

also regarding the tests it gives expected fails =1 in liveness.ll , from what i understand it is expected to fail anyway but is that a problem that needs to be fixed ?

AFAIK, no, this is not a problem. But make sure that you run the whole test suite. That is, probably now you run the tests in llvm/test/Transforms/Attributor/. After verifying that those pass,
you can run all the tests in llvm/test. Note that this should not make a difference since the Attributor is not currently used anywhere. But it's something that should be done in general.

llvm/lib/Transforms/IPO/Attributor.cpp
2395–2401	I think it's better if we use the already made function. I'll try to do that myself tomorrow and see what problems come up. [Irrelevant] Some more info about `static`: It means that you can use the function only in this compilation unit. Usually, a compilation unit is a `.cpp` file, but if you include a `.cpp` file in another, those 2 will make one compilation unit. As a side note, `static` has more to do with linking than with compiling. It basically means that the symbol (i.e. the function, which in assembly is just a label) is not exported. So, no code from another object file can jump to it. Here's a great video in case you're interested in this kind of stuff: https://www.youtube.com/watch?v=n4fI4eUTTKM

baziotis added inline comments.Mar 8 2020, 7:32 PM

llvm/lib/Transforms/IPO/Attributor.cpp
2405	After a little bit of search, `clang-tidy` fails in the Harbomaster because you can declare `L` as `auto *L` (instead of `auto L`). You can see the failed check by clicking on the Harbomaster failed build (https://reviews.llvm.org/B48509), then "pre-merge checks"(https://reviews.llvm.org/harbormaster/build/52266/), then "External Link clang-tidy linux" (https://results.llvm-merge-guard.org/amd64_debian_testing_clang8-4435/clang-tidy.txt).

omarahmed marked an inline comment as done.Mar 8 2020, 7:56 PM

omarahmed added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
2405	oh sry that was obvious somehow and i didn't saw it,Thanks :)

baziotis added inline comments.Mar 9 2020, 6:01 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2395–2401	So, I had forgotten about `namespace llvm`. You need to put the declaration of the function inside `namespace llvm` in `MustExecute.h`. Then, in `MustExecute.cpp`, you need to remove `static` but add `llvm::` in the name, i.e. `llvm::mayContainIrreducibleControl(...`. In short, the function has to be declared inside the llvm namespace and when you define it, you have to specify that you want to define a function from this specific namespace.

uenoku added inline comments.Mar 9 2020, 9:09 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2378	nit: Normally, the word "strongly connected components" contains its maximality. So I think it's more concise to omit the description for maximality.

baziotis added inline comments.Mar 9 2020, 10:12 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2378	I'm not sure there's a definitive answer here. It depends on the definition. If we use the "reachability" definition, then it's not implied. An SCC can contain an SCC. But if we talk about plural, i.e. SCCs of a graph, the definition contains the term partition. Which implies maximality. I've seen both in books and I believe it's a long discussion in which we don't want to get the reader to. The fact that I didn't remember that `scc_iterators` finds maximal SCCs led to 3-4 days at least of working on a wrong solution (in this patch actually).

Patch updates :

Add mayContainIrreducibleControl() function to the MustExecute.h so it could be used in mayContainUnboundedCycle() function in Attributtor.cpp.
Fix mayContainIrreducibleControl() function as the return value was reversed , it was :
- Return true if Reducible control.
- Return false if Irreducible control.
Minor fixes in docs.

I don't know if reversing the condition of mayContainIrreducibleControl is right but it seemed logical to me also I have run llvm\test\analysis tests and it didn't break any test

Harbormaster failed remote builds in B48593: Diff 249196!Mar 9 2020, 2:03 PM

Patch updates :

Add mayContainIrreducibleControl() function to the MustExecute.h so it could be used in mayContainUnboundedCycle() function in Attributtor.cpp.
Fix mayContainIrreducibleControl() function as the return value was reversed , it was :
- Return true if Reducible control.
- Return false if Irreducible control.
Minor fixes in docs.

Harbormaster completed remote builds in B48628: Diff 249263.Mar 9 2020, 7:26 PM

Generally it LGTM. I've added some nit comments. Before accepting, I'd like @jdoerfert opinion on the change in mayContainIrreducibleControl().
I think it's a correct change.

llvm/lib/Transforms/IPO/Attributor.cpp
2369–2372	Remove the TODO. Then please see the comment above.
2376	"If NoAnalysis is available ..." -> "If either SCEV or LoopInfo is not available ..."
2379	"you only need" -> "we only need"
2395	If there's irreducible control, the function may contain non-loop cycles.
2400	Any loop that does not have a max trip count is considered unbounded cycle.

Remove the TODO.
Then please see the comment above.

sry , I don't understand what u mean by the comment above , does u mean your comment to uenoko ?

uenoku added inline comments.Mar 10 2020, 8:52 AM

llvm/lib/Transforms/IPO/Attributor.cpp
2378	Ok, I'm fine with either way :)

In D74691#1914776, @omarahmed wrote:

Remove the TODO.
Then please see the comment above.

sry , I don't understand what u mean by the comment above , does u mean your comment to uenoko ?

No sorry, I meant this:

Here, you want something clearer to what the function does. I'd say, you can go with:
"Helper function that checks whether a function has any cycle which we don't know is bounded.
Loops with maximum trip count are considered bounded, any other cycle not."
We may have to revise this further but it seems a good start.

llvm/lib/Transforms/IPO/Attributor.cpp
2378	Cool :)

Minor cleanups

Harbormaster completed remote builds in B48707: Diff 249428.Mar 10 2020, 10:53 AM

Harbormaster completed remote builds in B48706: Diff 249425.

I added a single nit comment. Otherwise it LGTM! :)

llvm/lib/Transforms/IPO/Attributor.cpp
2372–2391	[Nit][Grammar] "we don't know is bounded" -> "we don't know if it is bounded"

Minor cleanup

In D74691#1915859, @baziotis wrote:

I added a single nit comment. Otherwise it LGTM! :)

Thanks for the great reviews and bearing my small mistakes through this :)

Harbormaster completed remote builds in B48758: Diff 249520.Mar 10 2020, 5:33 PM

In D74691#1915962, @omarahmed wrote:

In D74691#1915859, @baziotis wrote:

I added a single nit comment. Otherwise it LGTM! :)

Thanks for the great reviews and bearing my small mistakes through this :)

No problem, I hope it helped. :) Let's wait for a couple of days for other reviewers and then I can commit.

Although I just saw we should also change the commit message (i.e. the description).
The following is a quick write that seems fine.

Add mayContainUnboundedCycle which checks if there's a cycle
in a function that we consider unbounded.
All loops with max trip count are considered bounded.
Any other cycle is not.

Update commit message

In D74691#1916706, @omarahmed wrote:

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

Harbormaster completed remote builds in B48798: Diff 249592.Mar 11 2020, 5:44 AM

In D74691#1916738, @baziotis wrote:

In D74691#1916706, @omarahmed wrote:

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

done :) , Does we need more details than that ?

omarahmed edited the summary of this revision. (Show Details)Mar 11 2020, 6:15 AM

In D74691#1916782, @omarahmed wrote:

In D74691#1916738, @baziotis wrote:

In D74691#1916706, @omarahmed wrote:

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

done :) , Does we need more details than that ?

I think it is ok. Although I haven't seen much the tests. Let's wait a couple of days for another reviewer and we can commit then.

LGTM

LGTM, I'll commit

This revision is now accepted and ready to land.Mar 13 2020, 7:53 AM

Closed by commit rGb285b333dcd0: [Attributor] Detect possibly unbounded cycles in functions (authored by omarahmed, committed by jdoerfert). · Explain WhyMar 13 2020, 9:40 AM

This revision was automatically updated to reflect the committed changes.

Thanks all :D
I am sorry but a message sent to me with this

The Buildbot has detected a new failure on builder fuchsia-x86_64-linux while building llvm.

So does that mean i broke something here ?

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

In D74691#1921957, @uenoku wrote:

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

I think a better way to put it is that you don't have to care about the failure if you did not cause it (e.g. you might not get multiple emails if other bots were broken before)

In D74691#1922032, @fhahn wrote:

In D74691#1921957, @uenoku wrote:

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

I think a better way to put it is that you don't have to care about the failure if you did not cause it (e.g. you might not get multiple emails if other bots were broken before)

Yes, thank you for the correction.

Got it, Thanks :)

Revision Contents

Path

Size

llvm/

lib/

Transforms/

IPO/

Attributor.cpp

66 lines

test/

Transforms/

Attributor/

read_write_returned_arguments_scc.ll

5 lines

Diff 245525

llvm/lib/Transforms/IPO/Attributor.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 2,360 Lines • ▼ Show 20 Lines	STATS_DECL(UndefinedBehaviorInstruction, Instruction,
"Number of instructions known to have UB");		"Number of instructions known to have UB");
BUILD_STAT_NAME(UndefinedBehaviorInstruction, Instruction) +=		BUILD_STAT_NAME(UndefinedBehaviorInstruction, Instruction) +=
KnownUBInsts.size();		KnownUBInsts.size();
}		}
};		};

/// ------------------------ Will-Return Attributes ----------------------------		/// ------------------------ Will-Return Attributes ----------------------------

// Helper function that checks whether a function has any cycle.		// Helper function that checks whether a function has any cycle and check if
lebedev.riUnsubmitted Not Done Reply Inline Actions The old comment is strictly more understandable. New comment does not mention at all that we are checking inside functions only. lebedev.ri: The old comment is strictly more understandable. New comment does not mention at all that we…
baziotisUnsubmitted Not Done Reply Inline Actions Yeah, the addition that we check inside a function would be good. Otherwise, I think the new one is clearer on what it is considered an unbounded cycle in this case. baziotis: Yeah, the addition that we check inside a function would be good. Otherwise, I think the new…
		// the cycle have an upper bound or not.
		baziotisUnsubmitted Not Done Reply Inline Actions I think this doc is a little bit misleading. Maybe better `Helper function that checks whether a function has any unbounded cycle`. Actually, given that you change the name to what Johannes proposed, an even better doc I think is: `An unbounded cycle in this case is either a loop that does not have a maximal trip count or any other cycle`. baziotis: I think this doc is a little bit misleading. Maybe better `Helper function that checks whether…
// TODO: Replace with more efficent code		// TODO: Replace with more efficent code
		baziotisUnsubmitted Not Done Reply Inline Actions Here, you want something clearer to what the function does. I'd say, you can go with: "Helper function that checks whether a function has any cycle which we don't know is bounded. Loops with maximum trip count are considered bounded, any other cycle not." We may have to revise this further but it seems a good start. baziotis: Here, you want something clearer to what the function does. I'd say, you can go with: "Helper…
static bool containsCycle(Function &F) {		static bool containsCycle(Function &F, Attributor &A) {
		jdoerfertUnsubmitted Done Reply Inline Actions Here and in other places you need to fix the formatting, e.g., by running clang format on your code. You should configure your programming editor to run clang format on the patch for you. I have function! Formatonsave() let l:formatdiff = 1 pyf /data/src/llvm-project/clang/tools/clang-format/clang-format.py endfunction autocmd BufWritePre .h,.cc,.cpp call Formatonsave() in my vim configuration. Google should know how that is done for other editors. jdoerfert:* Here and in other places you need to fix the formatting, e.g., by running clang format on your…
		baziotisUnsubmitted Not Done Reply Inline Actions I think you still need to pass `A`. Are you sure this compiles? baziotis: I think you still need to pass `A`. Are you sure this compiles?
		omarahmedAuthorUnsubmitted Not Done Reply Inline Actions ohhh sry tried something and forgot to make it back in this part :) omarahmed: ohhh sry tried something and forgot to make it back in this part :)
		baziotisUnsubmitted Not Done Reply Inline Actions No worries. Try to be sure that any diff you upload compiles and passes the test suite. :) baziotis: No worries. Try to be sure that any diff you upload compiles and passes the test suite. :)
		baziotisUnsubmitted Not Done Reply Inline Actions Caps on the name. Be sure to check the nearby code and use clang-format (it seems you have done so, just reminding). baziotis: Caps on the name. Be sure to check the nearby code and use clang-format (it seems you have done…
		jdoerfertUnsubmitted Not Done Reply Inline Actions The name should now be something like `mayContainUnboundedCycle` jdoerfert: The name should now be something like `mayContainUnboundedCycle`
		lebedev.riUnsubmitted Not Done Reply Inline Actions I'm a little bit lost here, if a loop is not known to have `SmallConstantMaxTripCount`, why does that mean it's unbounded? lebedev.ri: I'm a little bit lost here, if a loop is not known to have `SmallConstantMaxTripCount`, why…
		baziotisUnsubmitted Not Done Reply Inline Actions It doesn't. In a previous review the name `mayContainUnboundedCycle` was proposed IIRC, which I prefer. baziotis: It doesn't. In a previous review the name `mayContainUnboundedCycle` was proposed IIRC, which I…
		baziotisUnsubmitted Not Done Reply Inline Actions Remove the TODO. Then please see the comment above. baziotis: Remove the TODO. Then please see the comment above.
SmallPtrSet<BasicBlock *, 32> Visited;		bool NoAnalysis = false;

// Traverse BB by dfs and check whether successor is already visited.		ScalarEvolution *SE =
for (BasicBlock *BB : depth_first(&F)) {		A.getInfoCache().getAnalysisResultForFunction<ScalarEvolutionAnalysis>(F);
		baziotisUnsubmitted Not Done Reply Inline Actions "If NoAnalysis is available ..." -> "If either SCEV or LoopInfo is not available ..." baziotis: "If NoAnalysis is available ..." -> "If either SCEV or LoopInfo is not available ..."
Visited.insert(BB);		LoopInfo *LI = A.getInfoCache().getAnalysisResultForFunction<LoopAnalysis>(F);
		jdoerfertUnsubmitted Done Reply Inline Actions You have to check if LI or SE are `null`, that can happen. jdoerfert: You have to check if LI or SE are `null`, that can happen.
for (auto *SuccBB : successors(BB)) {
		jdoerfertUnsubmitted Not Done Reply Inline Actions Nit: `bool NoAnalysis = !SE \|\| !LI;` jdoerfert: Nit: `bool NoAnalysis = !SE \|\| !LI;`
		uenokuUnsubmitted Not Done Reply Inline Actions nit: Normally, the word "strongly connected components" contains its maximality. So I think it's more concise to omit the description for maximality. uenoku: nit: Normally, the word "strongly connected components" contains its maximality. So I think…
		baziotisUnsubmitted Not Done Reply Inline Actions I'm not sure there's a definitive answer here. It depends on the definition. If we use the "reachability" definition, then it's not implied. An SCC can contain an SCC. But if we talk about plural, i.e. SCCs of a graph, the definition contains the term partition. Which implies maximality. I've seen both in books and I believe it's a long discussion in which we don't want to get the reader to. The fact that I didn't remember that `scc_iterators` finds maximal SCCs led to 3-4 days at least of working on a wrong solution (in this patch actually). baziotis: I'm not sure there's a definitive answer here. It depends on the definition. If we use the…
		uenokuUnsubmitted Not Done Reply Inline Actions Ok, I'm fine with either way :) uenoku: Ok, I'm fine with either way :)
		baziotisUnsubmitted Not Done Reply Inline Actions Cool :) baziotis: Cool :)
if (Visited.count(SuccBB))		if (!LI \|\| !SE)
		baziotisUnsubmitted Done Reply Inline Actions The `BB` doesn't have to be a loop header I think as `getLoopFor()` will return you for a BB, quoting the doc: "the inner most loop that BB lives in". So, any BB in the loop will do it. If the BB not in a loop, it will return you `null` (which you'll have to check). baziotis: The `BB` doesn't have to be a loop header I think as `getLoopFor()` will return you for a BB…
		baziotisUnsubmitted Not Done Reply Inline Actions "you only need" -> "we only need" baziotis: "you only need" -> "we only need"
		NoAnalysis = true;

		jdoerfertUnsubmitted Done Reply Inline Actions Even if the loop has a constant maximal trip count we cannot assume all loops in the function do. So we need to continue the traversal here. jdoerfert: Even if the loop has a constant maximal trip count we cannot assume all loops in the function…
		baziotisUnsubmitted Done Reply Inline Actions I believe you can get all those (LI, SE and their tests for null) out (i.e. before) of the loop, I don't think they can change. baziotis: I believe you can get all those (LI, SE and their tests for null) out (i.e. before) of the loop…
		jdoerfertUnsubmitted Done Reply Inline Actions Yes. But you also cannot `continue` if LI or SE is null. If you don't have those analysis you need to be conservative about the result. jdoerfert: Yes. But you also cannot `continue` if LI or SE is null. If you don't have those analysis you…
		baziotisUnsubmitted Done Reply Inline Actions True and in the case that either of them is null, I think it will still be faster to work with SCCs rather than falling back to the old version. Specifically, the SCC iterator is lazy. That is, it gets you an SCC when you ask for it, which in your case is either in the beginning or in the `++` of the iterator (Check: https://llvm.org/doxygen/SCCIterator_8h_source.html#l00117. You can see that it calls `GetNextSCC()` which does the job of getting the next SCC. This is done every time you `++` it). So, in summary, check if you have any SCC at all and if yes, return true. baziotis: True and in the case that either of them is null, I think it will still be faster to work with…
		for (scc_iterator<Function *> I = scc_begin(&F), IE = scc_end(&F); I != IE;
		baziotisUnsubmitted Not Done Reply Inline Actions You don't need any of that. This is what was mentioned in a previous diff: Specifically that SCC iterator does not loop through all SCCs, only the top-level one, so you can't do that. baziotis: You don't need any of that. This is what was mentioned in a previous diff: Specifically that…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions I tried to do the old method but with no false positives, from what I understand here we wanted only to detect if a cycle exists only and doesn't care if it is a loop or not so I thought that SCCs evenwhen they will give us the top level ones these top level ones will mean that there is a cycle and if there were not these top level SCC structure and it is only consists of single nodes SCCs it will make us not detect a cycle so i used hasCycle function to detect that , does a top level SCC can exist that is not a cycle and a smaller SCC that is a cycle exist in it ? omarahmed: I tried to do the old method but with no false positives, from what I understand here we wanted…
		baziotisUnsubmitted Not Done Reply Inline Actions Sorry, I wrote these comments quite quickly yesterday. You're right. To answer, at first, no, any SCC unless it is single-node without self-loop. Now, `scc_iterator` uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones. Also, note that in this case, you could use the algorithm version with `Processed`. This checked whether there is a cycle in the graph in the same complexity as Tarjan, but it's also simpler. Use what you like. However, if you use the `scc_iterator` version, we need a good explanation (you can pick stuff from this comment). baziotis: Sorry, I wrote these comments quite quickly yesterday. You're right. To answer, at first, no…
		++I) {
		const std::vector<BasicBlock > &SCCBBs = I;
		baziotisUnsubmitted Done Reply Inline Actions You don't want to do this here. You first want to test `L` before you test with its parent because `L` might be equal (in blocks) with your SCC. I think, following the cases I listed can help. First of all, try to get `L`. If you don't have a loop (so, `L` is null) then this SCC is not a loop so return true immediately. Otherwise, test their sizes (SCC and L, in blocks). And the below cases follow: If they're equal: If there's a max trip count, continue with the next SCC. Otherwise, return true. Otherwise, if the SCC is smaller, return true Otherwise (i.e. the SCC is bigger), get the parent loop and do the same steps for the parent loop. Note that this last case effectively implies a loop. Finally, if the SCCs end, return false (i.e. in the end of the function). baziotis: You don't want to do this here. You first want to test `L` before you test with its parent…
		bool SCCHasLoop = I.hasLoop();
		baziotisUnsubmitted Not Done Reply Inline Actions There's https://reviews.llvm.org/D74801. If it gets committed before this, I'll notify you to just change the naming. baziotis: There's https://reviews.llvm.org/D74801. If it gets committed before this, I'll notify you to…
		// When NoAnalysis available then check only if the SCC has loop or not.
		if (NoAnalysis) {
		if (SCCHasLoop)
		lebedev.riUnsubmitted Not Done Reply Inline Actions s/if/whether/ lebedev.ri: s/if/whether/
		return true;
		else
		continue;
		baziotisUnsubmitted Not Done Reply Inline Actions [Nit][Grammar] "we don't know is bounded" -> "we don't know if it is bounded" baziotis: [Nit][Grammar] "we don't know is bounded" -> "we don't know if it is bounded"
		}
		// When current SCC has a single block with no loop continue to the next
		baziotisUnsubmitted Not Done Reply Inline Actions Note that as I mentioned, if a node of a cycle is within a loop, it doesn't mean it is the loop. baziotis: Note that as I mentioned, if a node of a cycle is within a loop, it doesn't mean it is the loop.
		// SCC.
		baziotisUnsubmitted Not Done Reply Inline Actions To me, that's a little bit inaccurate. It seems like this `if` checks if the SCC is a loop or not. But that's actually the whole code that follows. So, I'd replace it with a comment in the line above (exactly above the `getForLoop()`) since these 2 are coupled. `If any random block in this SCC does not belong to a loop, then this SCC is definitely not a loop.` baziotis: To me, that's a little bit inaccurate. It seems like this `if` checks if the SCC is a loop or…
		if (!SCCHasLoop)
		baziotisUnsubmitted Not Done Reply Inline Actions No braces. baziotis: No braces.
		baziotisUnsubmitted Not Done Reply Inline Actions If there's irreducible control, the function may contain non-loop cycles. baziotis: If there's irreducible control, the function may contain non-loop cycles.
		continue;
		jdoerfertUnsubmitted Not Done Reply Inline Actions Nit: Move the decleration of `SCCBBs` here (where it is used). Nit: You probably don't need the iterator but just `SCCBBs.front()`. jdoerfert: Nit: Move the decleration of `SCCBBs` here (where it is used). Nit: You probably don't need the…
		jdoerfertUnsubmitted Not Done Reply Inline Actions if (!L \|\| !SE->getSmallConstantMaxTripCount(L)) return true; You ignore non-loop cycles otherwise. Please add a non-loop cycle test case, aka. irreducible control flow test case. Here is a piece of C that should help create the test: if (foo()) goto entry1; else goto entry2; entry1: if (foo()) goto exit; else goto entry2; entry2: if (foo()) goto exit; else goto entry1; exit: return; jdoerfert: ``` if (!L \|\| !SE->getSmallConstantMaxTripCount(L)) return true; ``` You ignore non-loop…
		baziotisUnsubmitted Not Done Reply Inline Actions Please note that this code (i.e. the original) is wrong anyway, as for example in this it finds a cycle because we don't mark nodes unvisited when trying for a new cycle: define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) { entry: %tobool = icmp ne i32* %n0, null br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry %0 = load i32, i32* %r0, align 4 store i32 %0, i32* %w0, align 4 br label %return return: ; preds = %if.end, %if.then ret i32* %w0 } As I noted in earlier comments, to find all cycles correctly, one has to visit but then unvisit nodes when trying a new cycle. This eventually led to new algorithms which did full cycle detection and I ended up with this algorithm with the caveats mentioned: https://reviews.llvm.org/D74691/new/#1894286 baziotis: Please note that this code (i.e. the original) is wrong anyway, as for example in this it finds…
		jdoerfertUnsubmitted Not Done Reply Inline Actions Not wrong but too conservative ;) That's a big difference. However, I agree, we should be as precise as is reasonably cheap. jdoerfert: Not wrong but too conservative ;) That's a big difference. However, I agree, we should be as…
		baziotisUnsubmitted Not Done Reply Inline Actions If you mean that the actual problem is the speed which is a side-effect of conservation, then yes, definitely it's unreasonably slow (given a complete graph, it has sth like exponential time). Other than that, the fact that a block of a cycle is inside a loop does not mean that the cycle is this loop. Maybe we can use it if the block is special and it has this specific property. For example, if we knew that the header/latch/... of a loop can't be part of a non-loop cycle. I don't know if something like that holds, I'll try to check tomorrow. baziotis: If you mean that the actual problem is the speed which is a side-effect of conservation, then…

		std::vector<BasicBlock *>::const_iterator BBI = SCCBBs.begin();

		Loop L = LI->getLoopFor(BBI);
		baziotisUnsubmitted Not Done Reply Inline Actions Caps starting the sentence. baziotis: Caps starting the sentence.
		jdoerfertUnsubmitted Not Done Reply Inline Actions Nit: Maybe `for (auto &It : make_range(scc_begin(&F), scc_end(&F)))` Or create a helper `scc_range(..)` that create the range. jdoerfert: Nit: Maybe `for (auto &It : make_range(scc_begin(&F), scc_end(&F)))` Or create a helper…
		omarahmedAuthorUnsubmitted Not Done Reply Inline Actions I am kind of stuck in using hasLoop function with using these method for looping :) , so how could I change It( vector<Block>) to SCC_iter<Function > type omarahmed: I am kind of stuck in using hasLoop function with using these method for looping :) , so how…
		jdoerfertUnsubmitted Not Done Reply Inline Actions Never mind then ;) jdoerfert: Never mind then ;)
		baziotisUnsubmitted Not Done Reply Inline Actions One note is that we get the innermost loop with `getLoopFor()` rather than the outermost. Also, the part about the loop cycle was unclear to me. Still, I feel breaking it down into 3 cases is clearer but here's another take: L is the innermost loop that has a common block with the SCC. Since a loop is always an SCC, if their number of blocks are equal, the SCC is a loop - specifically, L. Otherwise, there are 2 cases: - If the SCC has less blocks, then it is definitely not a loop. - If it has more, then we can't decide since the SCC can be a parent loop of L. So, we perform the same test for the parent of L. baziotis: One note is that we get the innermost loop with `getLoopFor()` rather than the outermost. Also…
		baziotisUnsubmitted Not Done Reply Inline Actions Any loop that does not have a max trip count is considered unbounded cycle. baziotis: Any loop that does not have a max trip count is considered unbounded cycle.
		// Check whether the current SCC is a loop or not(Loop is Null).
		baziotisUnsubmitted Not Done Reply Inline Actions As @jdoerfert mentioned, you can replace this with mayContainIrreducibleControl baziotis: As @jdoerfert mentioned, you can replace this with [[ https://github.com/llvm/llvm…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions I found that the function is in mustExcute.cpp only and not available in the header file mustExcute.h so how could i use it in this case i guess it should be added to the header :) omarahmed: I found that the function is in mustExcute.cpp only and not available in the header file…
		baziotisUnsubmitted Not Done Reply Inline Actions Yep, true, it's `static`. Please add it to the header. :) baziotis: Yep, true, it's `static`. Please add it to the header. :)
		omarahmedAuthorUnsubmitted Done Reply Inline Actions i tried to erase static as static means that it will only used in the cpp file also i have put the definition in the header file but that resulted in an error in linking so how could we do that here ? I guess also that all the functions i saw was only static in the cpp file or they are belong to a class in the .h or cpp file but no function that is implemented alone so I think that getting this 3 lines is more clean or what do you think ? omarahmed: i tried to erase static as static means that it will only used in the cpp file also i have put…
		baziotisUnsubmitted Not Done Reply Inline Actions I think it's better if we use the already made function. I'll try to do that myself tomorrow and see what problems come up. [Irrelevant] Some more info about `static`: It means that you can use the function only in this compilation unit. Usually, a compilation unit is a `.cpp` file, but if you include a `.cpp` file in another, those 2 will make one compilation unit. As a side note, `static` has more to do with linking than with compiling. It basically means that the symbol (i.e. the function, which in assembly is just a label) is not exported. So, no code from another object file can jump to it. Here's a great video in case you're interested in this kind of stuff: https://www.youtube.com/watch?v=n4fI4eUTTKM baziotis: I think it's better if we use the already made function. I'll try to do that myself tomorrow…
		baziotisUnsubmitted Not Done Reply Inline Actions So, I had forgotten about `namespace llvm`. You need to put the declaration of the function inside `namespace llvm` in `MustExecute.h`. Then, in `MustExecute.cpp`, you need to remove `static` but add `llvm::` in the name, i.e. `llvm::mayContainIrreducibleControl(...`. In short, the function has to be declared inside the llvm namespace and when you define it, you have to specify that you want to define a function from this specific namespace. baziotis: So, I had forgotten about `namespace llvm`. You need to put the declaration of the function…
		if (!L)
		baziotisUnsubmitted Not Done Reply Inline Actions `return` in a newline baziotis: `return` in a newline
		baziotisUnsubmitted Not Done Reply Inline Actions The SCC might have a loop inside, but it might not be a loop itself. I propose something like: If L is nullptr, we found no loop that matches exactly the number of blocks of the SCC and so the SCC is not a loop Feel free to revise it but still remain accurate. :) baziotis: The SCC might have a loop inside, but it might not be a loop itself. I propose something like…
return true;		return true;

		// Compare the number of blocks of SCC and the loop and check :
		baziotisUnsubmitted Not Done Reply Inline Actions After a little bit of search, `clang-tidy` fails in the Harbomaster because you can declare `L` as `auto L` (instead of `auto L`). You can see the failed check by clicking on the Harbomaster failed build (https://reviews.llvm.org/B48509), then "pre-merge checks"(https://reviews.llvm.org/harbormaster/build/52266/), then "External Link clang-tidy linux" (https://results.llvm-merge-guard.org/amd64_debian_testing_clang8-4435/clang-tidy.txt). baziotis:* After a little bit of search, `clang-tidy` fails in the Harbomaster because you can declare `L`…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions oh sry that was obvious somehow and i didn't saw it,Thanks :) omarahmed: oh sry that was obvious somehow and i didn't saw it,Thanks :)
		// When the loop have more number of blocks then the current SCC is not
		baziotisUnsubmitted Not Done Reply Inline Actions You can cache this as you use it 2 times. baziotis: You can cache this as you use it 2 times.
		jdoerfertUnsubmitted Not Done Reply Inline Actions Just move this conditional before the `NoAnalysis` check. jdoerfert: Just move this conditional before the `NoAnalysis` check.
		baziotisUnsubmitted Not Done Reply Inline Actions As I said in a previous comment, `getLoopsInPreorder()` (or any way that `LoopInfo` gives you access to all the loops) gives you the top-level loops. The fact that a top-level loop has a maximal trip count does not mean that its children loops (i.e. loops entirely contained within this loop) will also have a maximal trip count. So, for every top-level loop, you have to check all its children (i.e. `getSubLoops()`). And all that, in a loop / recursion, because the sub loops may themselves have children loops etc. baziotis: As I said in a previous comment, `getLoopsInPreorder()` (or any way that `LoopInfo` gives you…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions I have tried it with a test like that for(int i=0;i<n;i++) for(int j =0;j<n;j++) for(int k = 0;k<n;k++) for(int l = 0;l<n;l++) and printed the loops that it iterate on and i found it had iterated even on the forth level one , also I have looked on the implementation of the cycle and found that it recursivly was going inside the loops so i guess it goes throw all of the loops not only the top-level ones or I miss something here ? omarahmed: I have tried it with a test like that ``` for(int i=0;i<n;i++) for(int j =0;j<n;j++)…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions I mean the implementation of the function not cycle it is a typo :) omarahmed: I mean the implementation of the function not cycle it is a typo :)
		baziotisUnsubmitted Not Done Reply Inline Actions Yes, also true, it gives you all the loops in preorder, I just checked the function. So, yes that should work. baziotis: Yes, also true, it gives you all the loops in preorder, I just checked the function. So, yes…
		// a loop.
		baziotisUnsubmitted Not Done Reply Inline Actions I'd propose that you add a little bit more here like `... is not a loop since getLoopFor() gives us initially the innermost loop. Then, with getParentLoop(), we move strictly from innermost to outermost. Since a loop is always an SCC, if this SCC was a loop, we would match exactly the number of blocks in one of this loops. If the number of blocks is less, then this SCC is a non-loop cycle inside one of these loops (or just a cycle if no loop exists)`. That's maybe quite verbose, if you think something more concise, you may add it. But in any case, I think this is a subtle point that should not be left without explanation. @jdoerfert what do you think? baziotis: I'd propose that you add a little bit more here like `... is not a loop since getLoopFor()…
		// When they are equal then it is a loop and check the maxTripCount if it
		// has an upper bound.
		// Else we get the parent loop and check again.
		baziotisUnsubmitted Not Done Reply Inline Actions Why? This another subtle point. I think you can use some of the initial description I had written if you think it's good. Note: As a general rule of thumb, in docs, try to comment why something is done rather what is done. The what is usually obvious while the why a lot of times not. baziotis: Why? This another subtle point. I think you can use some of the initial description I had…
		do {
		if (L->getNumBlocks() > SCCBBs.size())
		jdoerfertUnsubmitted Not Done Reply Inline Actions Style: no braces around single statements jdoerfert: Style: no braces around single statements
		jdoerfertUnsubmitted Not Done Reply Inline Actions Style: No `else` after `return`. jdoerfert: Style: No `else` after `return`.
		return true;

		if (L->getNumBlocks() == SCCBBs.size()) {
		jdoerfertUnsubmitted Not Done Reply Inline Actions Style: go over the comments and make them all sentences, e.g., capital starting letter, dot at the end, dot in between sentences, etc. Nit: Remove `(Loop is Null)` jdoerfert: Style: go over the comments and make them all sentences, e.g., capital starting letter, dot at…
		if (!SE->getSmallConstantMaxTripCount(L))
		return true;
		else
		continue;
		baziotisUnsubmitted Not Done Reply Inline Actions I believe here you want to continue to the next SCC but rather you continue to the next parent loop. In this case, since this do-while is the last thing in the SCC loop, you can just break. baziotis: I believe here you want to continue to the next SCC but rather you continue to the next parent…
}		}
		} while (L != L->getParentLoop() && (L = L->getParentLoop()));
		baziotisUnsubmitted Not Done Reply Inline Actions Sorry but this is unclear to me. Could you explain the reasoning? I'd think that the `while` version is easier to deal with, was there a problem with it? baziotis: Sorry but this is unclear to me. Could you explain the reasoning? I'd think that the `while`…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions the while version was making an infinte loop error , I think getParentLoop when there was no parent anymore it returned the same loop and not null so i fixed it by this first check . omarahmed: the while version was making an infinte loop error , I think getParentLoop when there was no…
		omarahmedAuthorUnsubmitted Done Reply Inline Actions oh i didn't see this continue part so i think now it is the one that makes that , i will edit it to the while version again :) omarahmed: oh i didn't see this continue part so i think now it is the one that makes that , i will edit…
		baziotisUnsubmitted Not Done Reply Inline Actions :) Btw, the idea to do a do-while and return if there was no initial loop was very good! Also, I wrote the `while` assuming that if there's no parent loop, `getParentLoop()` will return `null`. I now see there's no documentation in `getParentLoop()`, but given that code: https://llvm.org/doxygen/LoopInfo_8h_source.html#l00097 which chases back `ParentLoop`, which is what `getParentLoop()` returns, it should return null when there's no parent. baziotis: :) Btw, the idea to do a do-while and return if there was no initial loop was very good! Also…
		jdoerfertUnsubmitted Not Done Reply Inline Actions Nit: No braces. jdoerfert: Nit: No braces.
}		}
		baziotisUnsubmitted Not Done Reply Inline Actions You want to put those in a loop. You may need to get parent loop multiple times. Something like that. while (L) { if (L is bigger) return true; if (they're equal) { if (there's no max trip count) return true; else continue; } L = getParent(); } baziotis: You want to put those in a loop. You may need to get parent loop multiple times. Something like…
		jdoerfertUnsubmitted Not Done Reply Inline Actions And please add a comment block in the beginning of the function or before these size checks explaining what is going on here. jdoerfert: And please add a comment block in the beginning of the function or before these size checks…

		jdoerfertUnsubmitted Not Done Reply Inline Actions I have the feeling there is something missing. What if there is an outer SCC with no loop that is as big as the SCC? I think you need a matching loop with max trip count (not even constant btw.) for each SCC. That is, after the do-loop ended, check if `L` is null or not. If it is, we did not find a matching loop. (I'm unsure if that can happen wrt. the block order in SCCs but better save than sorry.) jdoerfert: I have the feeling there is something missing. What if there is an outer SCC with no loop that…
return false;		return false;
}		}

// Helper function that checks the function have a loop which might become an		// Helper function that checks the function have a loop which might become an
// endless loop		// endless loop
// FIXME: Any cycle is regarded as endless loop for now.		// FIXME: Any cycle is regarded as endless loop for now.
// We have to allow some patterns.		// We have to allow some patterns.
		jdoerfertUnsubmitted Not Done Reply Inline Actions The fixme is outdated, remove it. jdoerfert: The fixme is outdated, remove it.
static bool containsPossiblyEndlessLoop(Function *F) {		static bool containsPossiblyEndlessLoop(Function *F, Attributor &A) {
return !F \|\| !F->hasExactDefinition() \|\| containsCycle(*F);		return !F \|\| !F->hasExactDefinition() \|\| containsCycle(*F, A);
}		}

struct AAWillReturnImpl : public AAWillReturn {		struct AAWillReturnImpl : public AAWillReturn {
AAWillReturnImpl(const IRPosition &IRP) : AAWillReturn(IRP) {}		AAWillReturnImpl(const IRPosition &IRP) : AAWillReturn(IRP) {}

/// See AbstractAttribute::initialize(...).		/// See AbstractAttribute::initialize(...).
void initialize(Attributor &A) override {		void initialize(Attributor &A) override {
AAWillReturn::initialize(A);		AAWillReturn::initialize(A);

Function *F = getAssociatedFunction();		Function *F = getAssociatedFunction();
if (containsPossiblyEndlessLoop(F))		if (containsPossiblyEndlessLoop(F, A))
indicatePessimisticFixpoint();		indicatePessimisticFixpoint();
}		}

/// See AbstractAttribute::updateImpl(...).		/// See AbstractAttribute::updateImpl(...).
ChangeStatus updateImpl(Attributor &A) override {		ChangeStatus updateImpl(Attributor &A) override {
auto CheckForWillReturn = [&](Instruction &I) {		auto CheckForWillReturn = [&](Instruction &I) {
IRPosition IPos = IRPosition::callsite_function(ImmutableCallSite(&I));		IRPosition IPos = IRPosition::callsite_function(ImmutableCallSite(&I));
const auto &WillReturnAA = A.getAAFor<AAWillReturn>(*this, IPos);		const auto &WillReturnAA = A.getAAFor<AAWillReturn>(*this, IPos);
▲ Show 20 Lines • Show All 6,318 Lines • Show Last 20 Lines

llvm/test/Transforms/Attributor/read_write_returned_arguments_scc.ll

Show First 20 Lines • Show All 95 Lines • ▼ Show 20 Lines	if.end: ; preds = %entry
%call8 = call i32* @internal_ret0_nw(i32* %r1, i32* %w0)		%call8 = call i32* @internal_ret0_nw(i32* %r1, i32* %w0)
br label %return		br label %return

return: ; preds = %if.end, %if.then		return: ; preds = %if.end, %if.then
%retval.0 = phi i32* [ %call8, %if.end ], [ %r1, %if.then ]		%retval.0 = phi i32* [ %call8, %if.end ], [ %r1, %if.then ]
ret i32* %retval.0		ret i32* %retval.0
}		}

; CHECK: Function Attrs: argmemonly nofree norecurse nosync nounwind		; CHECK: Function Attrs: argmemonly nofree norecurse nosync nounwind willreturn
; CHECK-NEXT: define i32* @external_sink_ret2_nrw(i32* nofree readnone %n0, i32* nocapture nofree readonly %r0, i32* nofree returned writeonly "no-capture-maybe-returned" %w0)		; CHECK-NEXT: define i32* @external_sink_ret2_nrw(i32* nofree readnone %n0, i32* nocapture nofree readonly %r0, i32* nofree returned writeonly "no-capture-maybe-returned" %w0)
define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {		define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:		entry:
%tobool = icmp ne i32* %n0, null		%tobool = icmp ne i32* %n0, null
br i1 %tobool, label %if.end, label %if.then		br i1 %tobool, label %if.end, label %if.then

if.then: ; preds = %entry		if.then: ; preds = %entry
br label %return		br label %return
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	entry:
ret i32* %call1		ret i32* %call1
}		}

; Verify that we see only expected attribute sets, the above lines only check		; Verify that we see only expected attribute sets, the above lines only check
; for a subset relation.		; for a subset relation.
;		;
; CHECK-NOT: attributes #		; CHECK-NOT: attributes #
; CHECK: attributes #{{.*}} = { argmemonly nofree nosync nounwind }		; CHECK: attributes #{{.*}} = { argmemonly nofree nosync nounwind }
; CHECK: attributes #{{.*}} = { argmemonly nofree norecurse nosync nounwind }		; CHECK: attributes #{{.*}} = { argmemonly nofree norecurse nosync nounwind willreturn }
; CHECK: attributes #{{.*}} = { nosync nounwind }		; CHECK: attributes #{{.*}} = { nosync nounwind }
		; CHECK: attributes #{{.*}} = { norecurse nosync nounwind willreturn }
		omarahmedAuthorUnsubmitted Not Done Reply Inline Actions I don't understand this checks here at the end and this line was the only error that exists from the default tests but i don't understand , I added it for asking about it :) omarahmed: I don't understand this checks here at the end and this line was the only error that exists…
		jdoerfertUnsubmitted Not Done Reply Inline Actions The function attributes have changed so we see changes here. That is alright. jdoerfert: The function attributes have changed so we see changes here. That is alright.
; CHECK-NOT: attributes #		; CHECK-NOT: attributes #

This is an archive of the discontinued LLVM Phabricator instance.

[Attributor] Detect possibly unbounded cycles in functionsClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 245525

llvm/lib/Transforms/IPO/Attributor.cpp

llvm/test/Transforms/Attributor/read_write_returned_arguments_scc.ll

[Attributor] Detect possibly unbounded cycles in functions
ClosedPublic