This is an archive of the discontinued LLVM Phabricator instance.

[Attributor] Detect possibly unbounded cycles in functions
ClosedPublic

Authored by omarahmed on Feb 16 2020, 7:22 AM.

Details

Summary

This patch add mayContainUnboundedCycle helper function which checks whether a function has any cycle which we don't know if it is bounded or not.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
omarahmed added inline comments.Feb 19 2020, 4:58 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2464

the while version was making an infinte loop error , I think getParentLoop when there was no parent anymore it returned the same loop and not null so i fixed it by this first check .

omarahmed marked an inline comment as done.Feb 19 2020, 5:00 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2464

oh i didn't see this continue part so i think now it is the one that makes that , i will edit it to the while version again :)

baziotis added inline comments.Feb 20 2020, 2:44 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2464

:)
Btw, the idea to do a do-while and return if there was no initial loop was very good!
Also, I wrote the while assuming that if there's no parent loop, getParentLoop() will return null. I now see there's no documentation in getParentLoop(), but given that code: https://llvm.org/doxygen/LoopInfo_8h_source.html#l00097
which chases back ParentLoop, which is what getParentLoop() returns, it should return null when there's no parent.

omarahmed added a comment.EditedFeb 21 2020, 12:44 PM

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

Yes. Please keep the diff up to date so it is easier to follow.

[ Btw. if someone is up for it I have already the next extension in mind. though we can tackle that once this one is done ]

@jdoerfert I have found examples in the willreturn.ll file for explicit tests for bounded and unbounded loops and I fixed them , does we need to add more complex examples like a bounded loop inside unbounded loop or this is redundant and not important ?

I think you can first of all upload your diff and it will be easier to comment on what is missing. Some tests you can add (or check if there are some already in willreturn.ll) are basically stressing the algorithm.
That e.g. means add a quite deep loop. Add a non-loop SCC inside a loop. Add a loop that has max trip count inside another with max trip count (so you can check the moving upwards).

Yes. Please keep the diff up to date so it is easier to follow.

[ Btw. if someone is up for it I have already the next extension in mind. though we can tackle that once this one is done ]

I'm up for it. Although I'd like to wait for Omar to finish this one and see if he thinks he can tackle the next (assuming he wants to). :)

omarahmed updated this revision to Diff 246051.Feb 21 2020, 6:16 PM
  • improve the docs and add tests
omarahmed updated this revision to Diff 246052.Feb 21 2020, 6:20 PM

improve the docs and add tests

I'm up for it. Although I'd like to wait for Omar to finish this one and see if he thinks he can tackle the next (assuming he wants to). :)

I'm ready for it :)

jdoerfert added a comment.EditedFeb 21 2020, 6:43 PM

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

llvm/lib/Transforms/IPO/Attributor.cpp
2435–2445

Nit: No braces.

2439

Nit: Move the decleration of SCCBBs here (where it is used).
Nit: You probably don't need the iterator but just SCCBBs.front().

2445

Nit: Maybe for (auto &It : make_range(scc_begin(&F), scc_end(&F)))
Or create a helper scc_range(..) that create the range.

2446

I have the feeling there is something missing. What if there is an outer SCC with no loop that is as big as the SCC? I think you need a matching loop with max trip count (not even constant btw.) for each SCC. That is, after the do-loop ended, check if L is null or not. If it is, we did not find a matching loop.
(I'm unsure if that can happen wrt. the block order in SCCs but better save than sorry.)

2455

Style: No else after return.

2458

Style: go over the comments and make them all sentences, e.g., capital starting letter, dot at the end, dot in between sentences, etc.
Nit: Remove (Loop is Null)

llvm/test/Transforms/Attributor/willreturn.ll
596

I like the tests a lot. Can we add one or two more with irreducible control, hence SCCs that are not (natural) "loops".

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think, but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

I added a bunch of comments. Mostly minor. I think this is almost done and I like how it turned out. Once the comments are addressed I'm fine with this, @baziotis feel free to accept then.

Ok, yes.

The next step to improve willreturn is actually to allow loops with unknown bounds, e.g., link-list traversals but also SCCs that are not loops. During initialize we collect all the ones that might be endless, and in the updateImpl we have to "justify" why they are not. The justification criterion is: they do not contain any synchronizing instruction. AANoSync can help with identifying those ;)
@omarahmed Let me know if that is something you want to look into (eventually). If so, we can talk about it more.

[EDIT: Now that I think about it for another second, we should probably split that logic I describe above into a new abstract attribute, e.g., AALoopInformation. We also need to sync up with @uenoku, I remember vaguely that he looked into something at similar at some point but it's all pretty foggy...]

I may be missing something because I'm not familiar with synchronizing instructions but why does that guarantee that a loop will end? Or why can we assume it will end?
I was reading the C11 standard:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

That is, basically we can assume a loop ends (even if it doesn't actually) if it is side-effect free. That is, it has no sync instructions, no read/write instructions and no volatile accesses.
I don't know if we follow the C11 standard but don't we fill only 1 of 3 if we only test for synchronizing instructions?

C/C++ atomics and volatile are considered synchronization, according to AANoSync, so we are good on that front I think. Or did you mean something else?

That's what I figured by reading AANoSync but I had some wrong terminology in my head for the term "atomic". So, yes, you're right.

The one thing we should do however is start a discussion about a no-forward-process-guarantee attribute on llvm-dev. People that do not come from C/C++ already asked for that (https://bugs.llvm.org/show_bug.cgi?id=965). If you read the bug comments and look at the dates, we have that problem for a while and I think the function attribute is the cleanest solution here.

no-forward-process-guarantee is basically the attribute version of containsPossiblyEndlessLoop() right (when it returns true) ? (Although, I believe we should name this with process -> progress).
I don't quite get why this will be helpful though. I got that Reid Kleckner proposed forward-progress-guarantee so that we can put into functions in which we can remove infinite loops with no side-effects.

Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

  • forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

  • On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

omarahmed marked an inline comment as done.Feb 22 2020, 12:20 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2445

I am kind of stuck in using hasLoop function with using these method for looping :) , so how could I change It( vector<Block*>) to SCC_iter<Function *> type

[...]
Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

Assuming forward-progress-guarantees, as we do in LLVM now, you can:

  • Remove a loop that has no side-effect/sync, regardless of the trip count.
  • Replace a loop that has no side-effect/sync with an unreachable, if it is proven endless.

The second transformation is "stronger" but also more restrictive. For example,

while(c) {}

cannot be replaced with unreachable but can be removed if you don't know anything about c. You
can replace the loop with llvm.assume(!c) if you want.

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

  • forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

  • On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

no-forward-progress-guarantee would effectively mean that you cannot remove or replace loops that do not have side-effects/syncs. On way of thinking of this is: (non-)termination becomes observable and threads might be interrupted by events even if there is no direct control flow path.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

I'm not necessarily favoring a default but we need to propose this as an RFC on the list and see where people stand.

llvm/lib/Transforms/IPO/Attributor.cpp
2445

Never mind then ;)

omarahmed updated this revision to Diff 246096.Feb 22 2020, 9:25 PM
omarahmed marked an inline comment as not done.

minor updates and added tests

I'm OK with this. @baziotis @uenoku please take a look, comment and/or accept it as you see fit.

llvm/test/Transforms/Attributor/willreturn.ll
707

New test cases are great, not irreducible but still.

I just saw, the commit subject and message needs to be written though. Explaining what this patch does.

omarahmed updated this revision to Diff 246101.Feb 23 2020, 4:27 AM

[Attributor]Detect SCCs with unbounded cycles

  1. Updating D74691: [Attributor] add some pattern to containsCycle #
  2. Enter a brief description of the changes included in this update.
  3. The first line is used as subject, next lines as comment. #
  4. If you intended to create a new revision, use:
  5. $ arc diff --create

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

[...]
Btw, the search for some loops may need to happen in AAUB, since it is considered UB if I understood correctly.

So infinite loops without progress are UB, I think

Me too, because of the C11 standard excerpt I quoted above.

but loops that might be infinite without progress are not necessarily. We can remove the latter and insert an unreachable for UB as always.

Yes, that's what I meant. Assuming that here you meant "with", not "without" and "former", not "latter". Or otherwise I missed something.

Assuming forward-progress-guarantees, as we do in LLVM now, you can:

  • Remove a loop that has no side-effect/sync, regardless of the trip count.
  • Replace a loop that has no side-effect/sync with an unreachable, if it is proven endless.

The second transformation is "stronger" but also more restrictive. For example,

while(c) {}

cannot be replaced with unreachable but can be removed if you don't know anything about c. You
can replace the loop with llvm.assume(!c) if you want.

Ok, clearer, thanks!

The attribute would actually allow or prevent the above, deepening which default we choose. Basically opt-in or out of the forward progress which enables or disables the proposed reasoning.

Oh, ok. So, my assumption was:

  • forward-progress would mean: You can remove side-effect-free infinite loops. One implication of that is: This function _definitely_ forwards progress although it might never return.

This is also how I think people in llvm-dev wanted to use that: If you're compiling a C/C++ function, put this attribute so you can do this optimization. Otherwise, don't, so you can't.

  • On the other hand, no-forward-progress-guarantee I think does not make sense (because of its name) to mean: You can't remove such loops. I got that it means: "This function may get into a side-effect-free infinite loop".

Which is a piece of information nonetheless.

no-forward-progress-guarantee would effectively mean that you cannot remove or replace loops that do not have side-effects/syncs. On way of thinking of this is: (non-)termination becomes observable and threads might be interrupted by events even if there is no direct control flow path.

Yes, it makes sense.

All in all, definitely I'm not the one to decide but I would be glad if you could clarify which of the above are planned (or you plan) to be added.

I'm not necessarily favoring a default but we need to propose this as an RFC on the list and see where people stand.

Me neither, and I'm positive to the RFC of course. If I can help in any way, ping me. I'll definitely take part in the review. :)

llvm/lib/Transforms/IPO/Attributor.cpp
2431

There's https://reviews.llvm.org/D74801. If it gets committed before this, I'll notify you to just change the naming.

2440

To me, that's a little bit inaccurate. It seems like this if checks if the SCC is a loop or not. But that's actually the whole code that follows.
So, I'd replace it with a comment in the line above (exactly above the getForLoop()) since these 2 are coupled.
If any random block in this SCC does not belong to a loop, then this SCC is definitely not a loop.

2444–2445

The SCC might have a loop inside, but it might not be a loop itself. I propose something like:

If L is nullptr, we found no loop that matches exactly the number of blocks of the SCC and so the SCC is not a loop

Feel free to revise it but still remain accurate. :)

2444–2446

One note is that we get the innermost loop with getLoopFor() rather than the outermost. Also, the part about the loop cycle was unclear to me. Still, I feel breaking it down into 3 cases is clearer but here's another take:

L is the innermost loop that has a common block with the SCC. Since a loop is always an SCC, if their number of blocks
are equal, the SCC is a loop - specifically, L. Otherwise, there are 2 cases:
- If the SCC has less blocks, then it is definitely not a loop.
- If it has more, then we can't decide since the SCC can be a parent loop of L. So, we perform the same test for
the parent of L.

[Attributor]Detect SCCs with unbounded cycles

  1. Updating D74691: [Attributor] add some pattern to containsCycle #
  2. Enter a brief description of the changes included in this update.
  3. The first line is used as subject, next lines as comment. #
  4. If you intended to create a new revision, use:
  5. $ arc diff --create

You don't need this part anywhere.

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

That commit message I think is very good. It's good if you put it in the summary as well.

omarahmed edited the summary of this revision. (Show Details)Feb 23 2020, 11:43 AM
omarahmed retitled this revision from [Attributor] add some pattern to containsCycle to [Attributor] Detect SCCs with unbounded cycles.

[Attributor]Detect SCCs with unbounded cycles

This patch implements containsUnboundedCycle which checks for unbounded cycle
which is either a loop that does not have a maximal trip count or any other
cycle.

It contains some fixes for willreturn tests and addition of bounded
and unbounded loop tests.

The patch description is not very useful.
It should actually explain the patch, and why that is the correct solution.

llvm/lib/Transforms/IPO/Attributor.cpp
2417

The old comment is strictly more understandable.
New comment does not mention at all that we are checking inside functions only.

2434

s/if/whether/

baziotis added inline comments.Feb 23 2020, 12:00 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2417

Yeah, the addition that we check inside a function would be good. Otherwise, I think the new one is clearer on what it is considered an unbounded cycle in this case.

baziotis added a comment.EditedFeb 23 2020, 12:46 PM

@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:

As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.

I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:

As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.

I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside

; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }

@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:

As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.

I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside

; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }

It's good to refer to the LLVM IR, rather than the C/C++ because they can have multiple LLVM IR translations. Just to be sure we're talking about the same thing. :)
In this case, I assume you meant the relevant test case in willreturn.ll. If you put it in a file on its own and run it through the Attributor, with a couple of prints, you'll
see that NoAnalysis is true (btw, I recommend that you try to answer questions by following the code - it's always a learning experience).
I don't know why and that's a good question (on another topic, but still). Interestingly, both LI and SE are null. So, for some reason, for this function, we couldn't
get either of those analyses. I'll try to find some time to check tomorrow.

@omarahmed @jdoerfert

I'm in the unpleasant position to tell you that the method I proposed is wrong. SCCIterator uses Tarjan's algorithm which for some reason, I remembered it finds all strongly-connected components but actually, it finds only maximal.
That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC. That probably means that we have to fall-back to the method that Johannes had proposed:

As before, we identify all cycles via the depth-first search and the visited set. If a cycles is found we did bail before but now we ask LI & SE instead. If they say we found a proper loop header of a loop with a bounded trip count we can ignore that cycles and continue exploring.

I'm really sorry for that. I'll try to think if anything better can be done.

Edit: Let's not forget of course: A big thanks to @Meinersbur for pointing this out.

that's fine , no problem :D
I think i can move faster now as i know the small mistakes i have done till now from the great reviews :)

But if it finds only the maximum ones why it hadn't returned willreturn here , as it shouldn't have seen the while inside

; int non_loop_inside_loop(int n) {
;   int ans = 0;
;   for (int i = 0; i < n; i++) {
;     while (1)
;       ans++;
;   }
;   return ans;
; }

It's good to refer to the LLVM IR, rather than the C/C++ because they can have multiple LLVM IR translations. Just to be sure we're talking about the same thing. :)
In this case, I assume you meant the relevant test case in willreturn.ll. If you put it in a file on its own and run it through the Attributor, with a couple of prints, you'll
see that NoAnalysis is true (btw, I recommend that you try to answer questions by following the code - it's always a learning experience).
I don't know why and that's a good question (on another topic, but still). Interestingly, both LI and SE are null. So, for some reason, for this function, we couldn't
get either of those analyses. I'll try to find some time to check tomorrow.

that's a great tip,thank you, you are right :)

I

btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:

for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}

But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:

...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file

For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.

btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:

for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}

But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:

...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file

For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.

wow, that grapical approach is beautiful , thanks

regarding the old approach i tested it with the if-then-else test i listed before

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  br label %return

if.end:                                           ; preds = %entry
  %0 = load i32, i32* %r0, align 4
  store i32 %0, i32* %w0, align 4
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

and it seems to work fine with it after adding the analysis and maxTripCount part but it fails in another tests , I will work on them tmw and submit another diff

btw, I recommend that you try to answer questions by following the code - it's always a learning experience).

Since we took that road, let's have a little more fun. Let's start simple by printing the SCCs we get. A simple way to do that is with sth like:

for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
     ++It) {

  const std::vector<BasicBlock *> &SCCBBs = *It;
  for (BasicBlock *BB : SCCBBs) {
    dbgs() << *BB << "\n";
  }
  dbgs() << "----- END -----\n\n\n\n\n";
}

But, we can actually do better if we take a graphical view of the CFG, with view-cfg. You can do that with sth like:
./bin/opt -view-cfg test.ll
Assuming that you are in the llvm-project dir, that you have built opt and that your file is test.ll. Now, this will generate a .dot file.
This is a special "graphics" format for which we should not care about right now. If you do that, you'll probably see something like:

...
Writing '/tmp/cfgnon_loop_inside_loop-a0c3a3.dot'...  done. 
Trying 'xdg-open' program... Remember to erase graph file: /tmp/cfgnon_loop_inside_loop-a0c3a3.dot
gio: file:///tmp/cfgnon_loop_inside_loop-a0c3a3.dot: No application is registered as handling this file

For me, it created /tmp/cfgnon_loop_inside_loop-a0c3a3.dot (you won't necessarily have the same name, that's ok). Then, it will try to find
default program to open it (with the xdg-open command), which as you can see, for me it didn't work. Basically, you now have to have a program that understands
that file. The dot app will do the job, but you'll probably won't have it by default. Search online for how to install graphviz package. For example, in Ubuntu I think you can do it with sudo apt-get install graphviz.
Finally, you should be able to create a PDF out of the .dot file as:
dot -Tpdf /tmp/cfgnon_loop_inside_loop-a0c3a3.dot -o <pdf_filename>.pdf
Then, you can open the pdf and expect to see sth like this: https://imgur.com/a/oW2xgNt
I think viewing CFGs like that, at times can be very helpful. In this case for example, it becomes instantly apparent that the for and the while don't form a SCC.

wow, that grapical approach is beautiful , thanks

No problem :)

regarding the old approach i tested it with the if-then-else test i listed before

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  br label %return

if.end:                                           ; preds = %entry
  %0 = load i32, i32* %r0, align 4
  store i32 %0, i32* %w0, align 4
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

and it seems to work fine with it after adding the analysis and maxTripCount part but it fails in another tests , I will work on them tmw and submit another diff

Ok, good. I'll check the method again tomorrow.

omarahmed updated this revision to Diff 246332.Feb 24 2020, 4:06 PM

[Attributor]Detect functions with unbounded loops

This patch changes the approach of containsCycle function of looping on
the CFG from dfs to bfs as the dfs approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so bfs solves that
problem by only moving level by level on the CFG and only detects loops when
there is an edge to a BB in a visited level.
This patch also uses maxTripCount to detect unbounded loops in function.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

omarahmed retitled this revision from [Attributor] Detect SCCs with unbounded cycles to [Attributor]Detect functions with unbounded loops.Feb 24 2020, 4:07 PM
omarahmed edited the summary of this revision. (Show Details)
lebedev.ri added inline comments.Feb 25 2020, 12:02 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2417–2418

I'm a little bit lost here, if a loop is not known to have SmallConstantMaxTripCount,
why does that mean it's unbounded?

baziotis added a comment.EditedFeb 25 2020, 2:31 AM

[Attributor]Detect functions with unbounded loops

Please change that to "[Attributor] Detect possibly unbounded cycles in functions"

This patch changes the approach of containsCycle function of looping on
the CFG from dfs to bfs as the dfs approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so bfs solves that
problem by only moving level by level on the CFG and only detects loops when
there is an edge to a BB in a visited level.
This patch also uses maxTripCount to detect unbounded loops in function.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

This message is good, but we will make it better. For now, certainly the old method
is wrong, I just saw it, but also this method is wrong.
First of all, the old method is wrong because of the example you posted. Generally,
that's not a good way to test cycles because the idea is to do a sort of O(n^2), which means
marking nodes visited but then marking them unvisited. You have to do in a kind of recursive
algorithm where as you forward (and deeper in the recursion) you mark them and as you come
back (returning from the recursion) you unmark them. I'll come back to this in a bit.

Regarding your method, consider this case:

define i32* @test2(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %l2, label %l1

l1:
  br label %l2

l2:
  br label %return

return:
  ret i32* %w0
}

Visually (and you can see this with view-cfg) it looks like that: https://imgur.com/a/n8iG7Ig (using my awful drawing skills)
And we construct the branch to first visit the l2 (by putting in the true label of br) so that when l1 does BFS.
It detects it as a cycle but it's not.
Certainly, BFS is not a popular way to do cycle-detection and I'd abstain from that as all the algorithmic literature
that I know of uses some sort of DFS for this purpose.
So, for now, and we may be able to make it better, you can use something like this:

bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

This function works as a utility and you initialize the 2 sets in the main function.
I wrote this quickly but I think it works. Note the usage of the 2 sets:

  • Processed helps us not do the same work for the same starting BB. It does not help us identify cycles.
  • This is done by the Visited set. The trick here is that when returning from the recursion, we mark the nodes unvisited, so we can start search in a different tree.

Edit: I'll come back with how we can make maxTripCount fit into this.

llvm/lib/Transforms/IPO/Attributor.cpp
2417–2418

It doesn't. In a previous review the name mayContainUnboundedCycle was proposed IIRC, which I prefer.

2417–2418

Here, you want something clearer to what the function does. I'd say, you can go with:
"Helper function that checks whether a function has any cycle which we don't know is bounded.
Loops with maximum trip count are considered bounded, any other cycle not."
We may have to revise this further but it seems a good start.

omarahmed retitled this revision from [Attributor]Detect functions with unbounded loops to [Attributor] Detect possibly unbounded cycles in functions.Feb 25 2020, 8:13 AM

Edit: I'll come back with how we can make maxTripCount fit into this.

So, I'm back. Before continuing, please tell me if everything in this algorithm was clear.
If there are any questions, please ask. So, this algorithm should be strictly better
than the original but still we haven't used LoopInfo and SCEV.

Here's another quick version that I think does that. Btw, please don't code anything yet
because especially after my last mistake I want to be extra sure that what I propose
is totally correct. Here we go:

Ok, in the version I posted above, in the else if, where we check if the successor is
already visited, if it is true, we just found a cycle. But, how do we check
whether this cycle is a loop or not? One way is to getLoopFor() for a node of
the cycle and if we do get a Loop back, check if the size of the loop with the size
of the cycle is the same. If yes, it is a loop, specifically the one we got. This is a little
subtle, so to understand this, think that a loop is always a SCC (same for the cycle). Now,
2 SCCs are either completely disjoint (no common node) or they make one big SCC.
Since the loop and the cycle do have one common node (the one we used in getLoopFor()),
they are in the same SCC. So, if the cycle and the loop have the same number of nodes,
then they're the same thing.
But, how do we get the number of nodes?
Well, in the else if, where we find a cycle, one start of the cycle is
SuccBB. The thing is, going up the recursion, we have already seen this
node (that's why it's visited). So, going down the recursion (i.e. returning)
we're going to see it again. If we save what is this starting node and count
how many nodes we go down until we find it again, this is the size of the cycle.
The code looks something like this:

static SmallPtrSet<BasicBlock *, 32> *Visited;
static SmallPtrSet<BasicBlock *, 32> *Processed;
static BasicBlock *StartBB;
static unsigned BBCount;
bool containsCycleUtil(BasicBlock *BB) {
  if (Processed->count(BB)) {
    Visited->erase(BB);
    return false;
  }
  Processed->insert(BB);
  Visited->insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed->count(SuccBB) && containsCycleUtil(SuccBB)) {
      ++BBCount;
      // Continue going down, we haven't found
      // the start yet.
      if (StartBB != BB) {
        return true;
      }
    } else if (Visited->count(SuccBB)) {
      // Found cycle, save the start.
      StartBB = SuccBB;
      ++BBCount;
      return true;
    }
  }
  Visited->erase(BB);
  return false;
}

(I just moved the invariants to globals so that they're not passed
back and forth in the recursion).

What remains is to utilize the LoopInfo. I have
an algorithm but before doing that we have to solve an important problem. I don't seem to be able
to get LoopInfo, it's always null. I found that I probably
need to add -loops -analyze in the command-line but still
it doesn't work. @jdoerfert any idea?

omarahmed marked an inline comment as done.EditedFeb 25 2020, 11:00 AM
bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

what i get is that we will implement dfs by our hand to detect loops so can't we use stack method in dfs that will provide us to not split the function into 2 and keep all the logic inside contains cycle

bool containsCycle(BasicBlock *BB, SmallPtrSet<BasicBlock *, 32> &Visited, SmallPtrSet<BasicBlock *, 32> &Processed) {
  if (Processed.count(BB)) {
    Visited.erase(BB);
    return false;
  }
  Processed.insert(BB);
  Visited.insert(BB);
  for (auto *SuccBB : successors(BB)) {
    if (!Processed.count(SuccBB) && containsCycle(SuccBB, Visited, Processed))
      return true;
    else if (Visited.count(SuccBB))
      return true;
  }
  Visited.erase(BB);
  return false;
}

what i get is that we will implement dfs by our hand

Yes

to detect SCCs

Cycles, it's a little bit different.

so can't we use stack method in dfs that will provide us to not split the function into 2 and keep all the logic inside contains cycle

Yes you can. But note that the runtime stack (which is implemented in the hardware) is quite faster than a software stack.
OTOH, a software stack is not as limited to its depth as a software stack. But, given that we pass a single pointer, it should
be able to handle quite a big number of blocks.
It's an implementation detail and not the most important thing. That would be the algorithm. I'd prefer the recursion but we can sure ask @jdoerfert on this.

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

I mean the code you suggested that loops on SCCs using SCCiter then it appeared that tarjan only detects maximal SCCs, the one that uses tarjan algorithm :)

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Can u share the IR (with the appropriate RUN lines, thus tests) not the images please?

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Sorry, but I didn't understand your comment very much. The code I wrote certainly can't detect max trip count.
It only correctly (hopefully) finds if there's a cycle or not. The second modification also found the size of the cycle.
Did you write any code that used LI ? In that case, as I said earlier, I have some algorithm in mind that uses LI
but I can't seem to be able to get LI (i.e. it's always nullptr). So, probably the same will be true for you
even if your code is correct. Hopefully we can get an answer about why that happens.

I mean the code you suggested that loops on SCCs using SCCiter then it appeared that tarjan only detects maximal SCCs, the one that uses tarjan algorithm :)

With that code the reason you don't find any max trip count as I said is that you probably don't have LoopInfo. Running the examples,
I could not get LoopInfo in any way, even with -loops -analyze. That's a problem that we have to fix so that the new algorithm can be improved
with LoopInfo as well.

uenoku added a comment.EditedFeb 26 2020, 2:38 AM

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

omarahmed added a comment.EditedFeb 26 2020, 2:57 AM

I am sorry but I can't get what the test that the approach of moving on the SCCs fails in as I have tried to test it a bit with tests like that and printed the SCCs that it gets

https://i.imgur.com/fpZhpST.png
1 - entry
2- condition - body
3- return
4 - l4

https://i.imgur.com/UJJ3aH0.png
1- entry
2- l1 - l2 - l3 - l4 - l5
3- return
4- l6 - l7 - l8

https://i.imgur.com/92NS6rp.png
1- entry
2- l1 - l2 - l3 - l4 - l5 - l6 - l7
3- return

and in all of them it reached the cycle and outputs that the loop does not have a max trip count

Can u share the IR (with the appropriate RUN lines, thus tests) not the images please?

okay , sorry for this anonymous comment

the first IR :

define i32* @test1(i32* %n0, i32* %r0, i32* %w0) {
entry:
	%tmp = add i32 0, 0
	br label %condition

condition:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %body, label %return

body:
	%tobool3 = icmp ne i32* %n0, null
    br i1 %tobool3, label %l4, label %condition	

l4:
	br label %l4

return:                                           
  ret i32* %w0
}

the second IR :

define i32* @test2(i32* %n0, i32* %r0, i32* %w0) {
entry:
	br label %l1
l1:
	br label %l2
l2:
	br label %l3
l3:
	%tobool = icmp ne i32* %n0, null
    br i1 %tobool, label %return, label %l4
l4:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %l6, label %l5
l5:
	br label %l1
l6:
	br label %l7
l7:
	br label %l8
l8:
	br label %l6

return:                                           
  ret i32* %w0
}

the third IR :

define i32* @test3(i32* %n0, i32* %r0, i32* %w0) {
entry:
	br label %l1
l1:
	br label %l2
l2:
	br label %l3
l3:
	%tobool = icmp ne i32* %n0, null
    br i1 %tobool, label %return, label %l4
l4:
	%tobool2 = icmp ne i32* %n0, null
    br i1 %tobool2, label %l6, label %l5
l5:
	br label %l1
l6:
	br label %l7
l7:
	br label %l4

return:                                           
  ret i32* %w0
}

the run command :

./bin/opt -passes=attributor --attributor-disable=false /home/omar/mytest.ll -S

the code I tested it on :

static bool containsUnboundedCycle(Function &F, Attributor &A) {
  bool NoAnalysis = false;

  ScalarEvolution *SE =
      A.getInfoCache().getAnalysisResultForFunction<ScalarEvolutionAnalysis>(F);
  LoopInfo *LI = A.getInfoCache().getAnalysisResultForFunction<LoopAnalysis>(F);
  if (!LI || !SE){
    NoAnalysis = true;
    dbgs() << "no analysis in function\n";
  }

  for (scc_iterator<Function *> It = scc_begin(&F), IE = scc_end(&F); It != IE;
       ++It) {
    dbgs()<<"SCC start\n\n\n";
    for(auto SCCBB : *It){
      dbgs()<< *SCCBB <<"\n";
    }
    bool SCCHasLoop = It.hasLoop();
    if (!SCCHasLoop){
      dbgs()<<"SCC does not have loop\n";
      continue;
    }
    // When NoAnalysis available then check only if the SCC has loop or not.
    if (NoAnalysis && SCCHasLoop){
      dbgs() << "no analysis and does have loop\n";
      return true;
    }

    const std::vector<BasicBlock *> &SCCBBs = *It;
    // If any random block in this SCC does not belong to a loop, then this SCC
    // is definitely not a loop.
    Loop *L = LI->getLoopFor(SCCBBs.front());
    if (!L){
      dbgs() << "loop is null so current SCC does not have loop\n";
      return true;
    }

    // L is the innermost loop that has a common block with the SCC. Since a
    // loop is always an SCC, if their number of blocks are equal, the SCC is a
    // loop so we check if it is bounded or Unbounded loop by checking the
    // maxTripCount. Otherwise, there are 2 cases:
    // - If the SCC has less blocks, then it is definitely not a loop.
    // - If it has more, then we can't decide since the SCC can be a parent loop
    //   of L. So, we perform the same test for the parent of L.
    do {
      if (L->getNumBlocks() > SCCBBs.size()){
        dbgs() << "SCC is less so not loop\n";
        return true;
      }

      if (L->getNumBlocks() == SCCBBs.size())
        if (!SE->getSmallConstantMaxTripCount(L)){
          dbgs() << "there is no max trip count for the current SCC\n";
          return true;
        }
        dbgs() << "loop with max trip count\n";
        break;

    } while ((L = L->getParentLoop()));

    // Check if L is null, we found no loop that matches exactly the number of
    // blocks of the SCC and so the SCC is not a loop.
     if (!L)
      return true;
  }
  return false;
}

and in the three tests it gets me "there is no max trip count for the current SCC\n" and return true
so I what I was asking is that I can't figure out what is the test that would fail in this code ?
more explanation :
I can't figure out this test to try it and see why it is failing in it :)

That means, if a loop has an SCC inside it, we'll never find it since the loop is the maximal SCC.

baziotis added a comment.EditedFeb 26 2020, 12:17 PM

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

Thank you mate, I didn't know I was using the old pass manager.

@omarahmed I would advise you to not work anymore on the old algorithm as it's certainly not effective. I'm sorry, but
I barely have time to work on the new algorithm :/ Otherwise, I would try to answer it.

So, here's a new algorithm with a couple of debug prints:

static DenseMap<BasicBlock *, unsigned> VisitingTime;
bool containsCycleUtil(BasicBlock *BB) {
  dbgs() << "Visiting: " << BB->getName() << "\n";
  unsigned VisitingTimeCurrent = VisitingTime[BB];
  for (auto *SuccBB : successors(BB)) {
    dbgs() << BB->getName() << " -> " << SuccBB->getName() << "\n\n";
    unsigned &VisitingTimeSucc = VisitingTime[SuccBB];
    if (!VisitingTimeSucc) {
      VisitingTimeSucc = VisitingTimeCurrent + 1;
      containsCycleUtil(SuccBB);
      VisitingTimeSucc = 0;
    } else {
      dbgs() << "Found cycle: " << VisitingTimeCurrent << ", " << VisitingTimeSucc << "\n";
      assert(VisitingTimeCurrent >= VisitingTimeSucc);
      unsigned CycleSize = VisitingTimeCurrent - VisitingTimeSucc + 1;
      dbgs() << "Cycle size: " << CycleSize << "\n";
    }
  }
  return false;
}

static bool containsCycle(Attributor &A, Function &F) {
  BasicBlock *EntryBB = &F.getEntryBlock();
  VisitingTime[EntryBB] = 1;
  return containsCycleUtil(EntryBB);
}

The main modification compared to the last one is that I track visiting time
instead of going back the recursion. This makes the algorithm so much
cleaner (and reliable, for reasons that are not on topic).
Now, this algorithm, AFAICT, finds all cycles, without false positives. Plus, it detects
only statically reachable cycles. That's good news, it should be strictly better than the initial.

Naively, one could then say "let's apply the same idea as with the SCCs algorithm.
This does not fully work though because check this:

define void @loop_with_if(i1 %c1, i1 %c2) {
entry:
  br label %w1

w1:
  br i1 %c1, label %b1, label %exit

b1:
  br label %if

if:
  br i1 %c2, label %t1, label %e1

t1:
  br label %latch

e1:
  br label %latch

latch:
  br label %w1

exit:
  ret void
}

If you pay attention, it's a while with an if inside. If you draw out the CFG, you can
see that there's a cycle: w1 -> b1 -> if -> e1 -> latch -> w1
which the algorithm will (correctly) find. The cycle size is 5 but the loop size is 6.
So, in such loops, even if SCEV can deduce a max trip count (which is possible),
the initial method will fail to realize this is one loop.

Which makes a point: There can be a cycle, which is not a loop, which still does not
disrupt the maximum trip count guarantee (interestingly, that means that even if
SCCIterator gave us all the SCCs, we would stumble upon this problem).

Edit: Well, that was obvious before but not quite in the same context.

And there can be many many such real-world cases.
From this point on, the problem I think starts to get quite complicated. It seems we
need to start saying "if all the blocks of the cycle belong to the same loop and
these blocks contain the header and the latch of the loop blah.. blah.." then
this cycle can be skipped.

IMHO, we should maybe just live with a not-so-perfect-algorithm, or decide
that it is going to take a good amount of time to get it right.

And sorry for the long posts, but this problem turned out to be more complicated
that it initially seemed. I hope I helped a bit.

Could you make sure that you are using a new pass manager? (It means to use ./opt -passes=attributor .. but not ./opt -attributor ..)
If you are using an old pass manager, you might not get LoopInfo or other analysis.

Thank you mate, I didn't know I was using the old pass manager.

@omarahmed I would advise you to not work anymore on the old algorithm as it's certainly not effective. I'm sorry, but
I barely have time to work on the new algorithm :/ Otherwise, I would try to answer it.

So, here's a new algorithm with a couple of debug prints:

static DenseMap<BasicBlock *, unsigned> VisitingTime;
bool containsCycleUtil(BasicBlock *BB) {
  dbgs() << "Visiting: " << BB->getName() << "\n";
  unsigned VisitingTimeCurrent = VisitingTime[BB];
  for (auto *SuccBB : successors(BB)) {
    dbgs() << BB->getName() << " -> " << SuccBB->getName() << "\n\n";
    unsigned &VisitingTimeSucc = VisitingTime[SuccBB];
    if (!VisitingTimeSucc) {
      VisitingTimeSucc = VisitingTimeCurrent + 1;
      containsCycleUtil(SuccBB);
      VisitingTimeSucc = 0;
    } else {
      dbgs() << "Found cycle: " << VisitingTimeCurrent << ", " << VisitingTimeSucc << "\n";
      assert(VisitingTimeCurrent >= VisitingTimeSucc);
      unsigned CycleSize = VisitingTimeCurrent - VisitingTimeSucc + 1;
      dbgs() << "Cycle size: " << CycleSize << "\n";
    }
  }
  return false;
}

static bool containsCycle(Attributor &A, Function &F) {
  BasicBlock *EntryBB = &F.getEntryBlock();
  VisitingTime[EntryBB] = 1;
  return containsCycleUtil(EntryBB);
}

The main modification compared to the last one is that I track visiting time
instead of going back the recursion. This makes the algorithm so much
cleaner (and reliable, for reasons that are not on topic).
Now, this algorithm, AFAICT, finds all cycles, without false positives. Plus, it detects
only statically reachable cycles. That's good news, it should be strictly better than the initial.

Naively, one could then say "let's apply the same idea as with the SCCs algorithm.
This does not fully work though because check this:

define void @loop_with_if(i1 %c1, i1 %c2) {
entry:
  br label %w1

w1:
  br i1 %c1, label %b1, label %exit

b1:
  br label %if

if:
  br i1 %c2, label %t1, label %e1

t1:
  br label %latch

e1:
  br label %latch

latch:
  br label %w1

exit:
  ret void
}

If you pay attention, it's a while with an if inside. If you draw out the CFG, you can
see that there's a cycle: w1 -> b1 -> if -> e1 -> latch -> w1
which the algorithm will (correctly) find. The cycle size is 5 but the loop size is 6.
So, in such loops, even if SCEV can deduce a max trip count (which is possible),
the initial method will fail to realize this is one loop.

Which makes a point: There can be a cycle, which is not a loop, which still does not
disrupt the maximum trip count guarantee (interestingly, that means that even if
SCCIterator gave us all the SCCs, we would stumble upon this problem).

Edit: Well, that was obvious before but not quite in the same context.

And there can be many many such real-world cases.
From this point on, the problem I think starts to get quite complicated. It seems we
need to start saying "if all the blocks of the cycle belong to the same loop and
these blocks contain the header and the latch of the loop blah.. blah.." then
this cycle can be skipped.

IMHO, we should maybe just live with a not-so-perfect-algorithm, or decide
that it is going to take a good amount of time to get it right.

And sorry for the long posts, but this problem turned out to be more complicated
that it initially seemed. I hope I helped a bit.

I now understand the problem clearly , Thank you really , u helped me a lot :)
I will try to submit a diff tmw with this new algorithm with adding a doc :)

baziotis added a comment.EditedFeb 26 2020, 1:50 PM

I now understand the problem clearly , Thank you really , u helped me a lot :)

No problem! If you have any questions, please ask.

I will try to submit a diff tmw with this new algorithm with adding a doc :)

You may want to submit it initially without the LoopInfo.

omarahmed added a comment.EditedFeb 28 2020, 11:42 AM

I tried to use loopInfo on the algorithm , I have came with 2 approaches :
the first is that if we found a cycle we add the BB in the cycle in a allCyclesBBs vector and after we return from containCycleUtil with all the BBs that makes cycles then loop on the allCyclesBBs vector in the containsCycle function and get the information of the loop with getloopfor as we was doing or ask this inside the containCycleUtil and make this function return us if there was a cycle with no maxTripCount
the second is that we do this dfs in a stack style and if we find a cycle we ask for loop info and do the same checks as before like maxtripCount
I don't know which of them is better ?

also the algorithm sometimes in cases moves on BBs 2 times like this case here :

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %tobool2 = icmp ne i32* %n0, null
  br i1 %tobool2, label %return, label %if.end

if.end:                                           ; preds = %entry
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

so I think if we keep not only the visiting time but the start time = visitingTime and finish time of each node we have visited then we can detect a cycle if we reach a node with no finishTime that means that this node we have visited but not returned from it to make it finish , so this makes the complexity in all the situations O( N+E )

another question Is it normal to make global variables as I have not found any in the code ?

I tried to use loopInfo on the algorithm , I have came with 2 approaches :
the first is that if we found a cycle we add the BB in the cycle in a allCyclesBBs vector and after we return from containCycleUtil with all the BBs that makes cycles then loop on the allCyclesBBs vector in the containsCycle function and get the information of the loop with getloopfor as we was doing or ask this inside the containCycleUtil and make this function return us if there was a cycle with no maxTripCount
the second is that we do this dfs in a stack style and if we find a cycle we ask for loop info and do the same checks as before like maxtripCount
I don't know which of them is better ?

The second one unless I miss something. Oh, I just saw I didn't post the code with LoopInfo. But in any case,
you don't need to save the cycles for any reason cycles.

also the algorithm sometimes in cases moves on BBs 2 times

You probably mean the return label. Yes it does, because you try to find all cycles. Compare this to the initial algorithm in which you find if there's any cycle at all (check the Processed there that is missing here).

like this case here :

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %tobool2 = icmp ne i32* %n0, null
  br i1 %tobool2, label %return, label %if.end

if.end:                                           ; preds = %entry
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

so I think if we keep not only the visiting time but the start time = visitingTime and finish time of each node we have visited then we can detect a cycle if we reach a node with no finishTime that means that this node we have visited but not returned from it to make it finish , so this makes the complexity in all the situations O( N+E )

another question Is it normal to make global variables as I have not found any in the code ?

Generally, global variables are bad. However, when doing recursion, you don't want to pass parameters that are invariant because the fill the stack without a reason
and limit the recursion depth. So, you have 2 choices. Either you make global variables or you make a class that does the thing you want to do.
Then, the global variables get replaced by member variables. To me, this is mostly a matter of preference. Indeed, there are no global variables in the Attributor generally
but in this case, I don't know what would be preferred. This is the easiest thing to replace, so you can use them for now if you want.
In any case, this patch got an unexpected turn so I'd wait for @jdoerfert to tell us an opinion in general. Give him a little a bit of time since:
a) There's a lot to read.
b) He's travelling.

I haven't read all the comments but the code looks good with one problem, see below.

llvm/lib/Transforms/IPO/Attributor.cpp
2424

Nit: bool NoAnalysis = !SE || !LI;

2441
if (!L || !SE->getSmallConstantMaxTripCount(L))
  return true;

You ignore non-loop cycles otherwise. Please add a non-loop cycle test case, aka. irreducible control flow test case.

Here is a piece of C that should help create the test:

if (foo())
  goto entry1;
else
  goto entry2;

entry1:
  if (foo())
    goto exit;
  else
    goto entry2;
entry2:
  if (foo())
    goto exit;
  else
    goto entry1;
exit:
  return;
baziotis added inline comments.Mar 3 2020, 2:07 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2441

Please note that this code (i.e. the original) is wrong anyway, as for example in this it finds a cycle because
we don't mark nodes unvisited when trying for a new cycle:

define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) {
entry:
  %tobool = icmp ne i32* %n0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  br label %return

if.end:                                           ; preds = %entry
  %0 = load i32, i32* %r0, align 4
  store i32 %0, i32* %w0, align 4
  br label %return

return:                                           ; preds = %if.end, %if.then
  ret i32* %w0
}

As I noted in earlier comments, to find all cycles correctly, one has to visit but then unvisit nodes when trying a new cycle.
This eventually led to new algorithms which did full cycle detection and I ended up with this algorithm with the caveats mentioned: https://reviews.llvm.org/D74691/new/#1894286

omarahmed added a comment.EditedMar 3 2020, 2:18 PM

Please add a non-loop cycle test case, aka. irreducible control flow test case.

okay I will add it in the next diff :)

jdoerfert added inline comments.Mar 4 2020, 1:49 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2441

Not wrong but too conservative ;) That's a big difference. However, I agree, we should be as precise as is reasonably cheap.

baziotis added inline comments.Mar 4 2020, 3:08 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2441

If you mean that the actual problem is the speed which is a side-effect of conservation, then yes, definitely it's unreasonably slow (given a complete graph, it has sth like exponential time).
Other than that, the fact that a block of a cycle is inside a loop does not mean that the cycle is this loop.
Maybe we can use it if the block is special and it has this specific property. For example, if we knew that the header/latch/... of a loop can't be part of a non-loop cycle. I don't know if something like that holds, I'll try to check tomorrow.

omarahmed updated this revision to Diff 248394.Mar 4 2020, 10:22 PM
omarahmed edited the summary of this revision. (Show Details)EditedMar 4 2020, 10:22 PM

[Attributor]Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so the new approach detects cycles by using a recursive function that dfs on the CFG from the entry block and detects a cycle when we go in a recursive calls chain and visit one of the nodes of the chain.
This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.
This patch also uses maxTripCount to detect if a function has a cycle,then we check
further whether the cycle is a bounded loop cycle.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

omarahmed updated this revision to Diff 248395.Mar 4 2020, 10:28 PM

[Attributor]Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so the new approach detects cycles by using a recursive function that dfs on the CFG from the entry block and detects a cycle when we go in a recursive calls chain and visit one of the nodes of the chain.
This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.
This patch also uses maxTripCount to detect if a function has a cycle,then we check
further whether the cycle is a bounded loop cycle.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops.

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

llvm/lib/Transforms/IPO/Attributor.cpp
2439

Note that as I mentioned, if a node of a cycle is within a loop, it doesn't mean it is the loop.

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

From what i understaand is that when we have a case like that https://i.imgur.com/cvAjcMa.png and when cycle 1-2-3 have amax trip count and we visit them first we have a problem
but i think it won't make a problem since 2 and 3 which are in the bigger cycle must have edges in the cycle in here is the edge of 3 to 4 so 3 and 2 will not be marked processed until they completly finish the 3 node successors so it will complete in the bigger cycle and detect the cycle . I think marking processed at the end will make us escape from a problem like that

[Attributor]Detect possibly unbounded cycles in functions

so the new approach detects cycles by using a recursive function that dfs on the CFG

Note that this algorithm detects whether there is a cycle in the function (correctl). But now
note that if you find a cycle, you may want to skip it. In that case, it is wrong.
Specifically, because of the Processed, say you find a cycle A and you skip it
because it is a loop with max trip count (how do you do that reliably is another story).
There can be a cycle B (in which some nodes of A are contained) that won't be found.
That is a false negative (i.e. we'll say the function is willreturn but it might
not be). I think this is a problem.
I'll have to think if that can happen in the CFG, but from the graph theory perspective,
it definitely can.

This new approach also saves the processed nodes so not compute them two times thus enhancing the complexity of the algorithm to be O(N+E).
The approach also doesn't visit statically unreachable blocks so that also enhances the complexity.

Those 2 were true for the previous algorithm as well. If I'm not mistaken, we're doing strictly more work than the previous algorithm (although
the facts you wrote are correct).

From what i understaand is that when we have a case like that https://i.imgur.com/cvAjcMa.png and when cycle 1-2-3 have amax trip count and we visit them first we have a problem
but i think it won't make a problem since 2 and 3 which are in the bigger cycle must have edges in the cycle in here is the edge of 3 to 4 so 3 and 2 will not be marked processed until they completly finish the 3 node successors so it will complete in the bigger cycle and detect the cycle . I think marking processed at the end will make us escape from a problem like that

In such scenarios you won't have a problem indeed. What I meant is such scenarios: https://imgur.com/a/Ua8xwhg
Consider the case where you find the cycle 0-1-2-0 but it is a loop with max trip count. Upon returning (from recursion) in 1, you will visit 3. But from 3, you won't visit 1 because it's already processed.
If you're doing cycle detection (i.e. check if a graph has a cycle whatsoever), this is correct because there's no point revisiting processed nodes. But this is not our case because we skip.

So, to sum up: As I mentioned, LoopInfo I think can't be used reliably as it is used now for the points mentioned previously. And we have 2 cases:

  1. If we find a reliable way to skip cycles, in this algorithm we will have false negatives which I think is bad. Maybe better the initial version which while it had false positives, it didn't have false negatives.
  2. If not, this is probably as good as we can get in cycle detection and has a good complexity.

High level comments:

  • Please do not use global variables. If you need a set/map, create it on the stack and pass it as a reference.
  • If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)
  • If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)

That's a great idea! @omarahmed note that LoopInfo gives you the top-level loops (https://llvm.org/docs/LoopTerminology.html#id3). One way to get top-level loops is getLoopsInPreorder(). Then, you can get all the sub-loops of a loop using getSubLoops(). However, note that this will give you the immediate children if I'm not mistaken. That is, say you have a top-level loop A that contains B that contains C. A->getSubLoops() will give you B. You then have to do B->getSubLoops() to get C.
Before you do all that, you check for irreducible control of course.

Other than that, and you may skip that, I think it's interesting to see what happens behind the scenes. mayContainIrreducibleControl() calls containsIrreducibleCFG().
This requires LoopInfo. Apart from its implementation that is interesting, it's also educational to read about how do you test for irreducible control in a CFG (i.e. cycle that is not a Loop in the LLVM terms) without LoopInfo.
Here's the diff in which it was added: https://reviews.llvm.org/D40874#1013599 . Check online for T1 / T2 transformations in control-flow graphs (I just did 'cause of course I didn't know any of that and it's quite fun).

  • If the algorithm to distinguish cycles and loops makes problems, we can just go through all loops in loop info and verify there is no irreducible control (see for example mayContainIrreducibleControl in MustExecute.cpp)

That's a great idea! @omarahmed note that LoopInfo gives you the top-level loops (https://llvm.org/docs/LoopTerminology.html#id3). One way to get top-level loops is getLoopsInPreorder(). Then, you can get all the sub-loops of a loop using getSubLoops(). However, note that this will give you the immediate children if I'm not mistaken. That is, say you have a top-level loop A that contains B that contains C. A->getSubLoops() will give you B. You then have to do B->getSubLoops() to get C.
Before you do all that, you check for irreducible control of course.

Other than that, and you may skip that, I think it's interesting to see what happens behind the scenes. mayContainIrreducibleControl() calls containsIrreducibleCFG().
This requires LoopInfo. Apart from its implementation that is interesting, it's also educational to read about how do you test for irreducible control in a CFG (i.e. cycle that is not a Loop in the LLVM terms) without LoopInfo.
Here's the diff in which it was added: https://reviews.llvm.org/D40874#1013599 . Check online for T1 / T2 transformations in control-flow graphs (I just did 'cause of course I didn't know any of that and it's quite fun).

okay will work on it right away and see how this goes :) , Thanks for pointing me to this great resources will definitely check all of them :D

omarahmed updated this revision to Diff 248960.Mar 7 2020, 12:40 PM

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.
If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.
This patch also uses maxTripCount to check if a function have bounded loop cycles.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

omarahmed updated this revision to Diff 248963.Mar 7 2020, 1:25 PM

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.
If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.
This patch also uses maxTripCount to check if a function have bounded loop cycles.
Loops with maximum trip count are considered bounded, any other cycle not.
It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

[Attributor] Detect possibly unbounded cycles in functions

This patch changes the approach of containsCycle function of looping on
the CFG by depth_first to another approach as the depth_first function approach detects if-then-else part in
functions as a loop and that result in wrong behaviour, so this approach checks first the analysis of the function if no analysis available for the function then it loops on the SCCs and checkss if the SCC is a cycle or not,this is done with the assumption that any cycle is unbounded.

You don't need anything above here. :)

If analysis is present for the function then we check for irreducible control to check if there is a non loop cycles if there weren't any then we loop on the loop cycles in the function and check the maxTripCount of the loop.

It would be great to fix grammar / syntax a little bit. You can break sentences to make this easier. Also, you can be a little more specific (when it is useful). For example:
If LoopInfo and SCEV analyses are available, then:

  • We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.
  • Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Note that the first step can be done more cheaply without requiring LoopInfo. But we use it since it is implicitly needed in the second step.

It also contains some fixed tests and some added tests contain bounded and
unbounded loops and non-loop cycles.

This is good, you can leave that.

Aside from the description, we have to do something when analyses are not available. My proposal is to use the old method, again because it doesn't have false negatives and it's reasonably cheap.
Lastly, I want to reiterate that if there's anything that is unclear, please ask.

llvm/lib/Transforms/IPO/Attributor.cpp
2422–2428

You don't need any of that. This is what was mentioned in a previous diff: Specifically that SCC iterator does not loop through all SCCs, only the top-level one, so you can't do that.

2438–2444

As @jdoerfert mentioned, you can replace this with mayContainIrreducibleControl

2448–2449

As I said in a previous comment, getLoopsInPreorder() (or any way that LoopInfo gives you access to all the loops) gives you the top-level loops. The fact that a top-level loop has a maximal trip count does not mean that its children loops (i.e. loops entirely contained within this loop) will also have a maximal trip count. So, for every top-level loop, you have to check all its children (i.e. getSubLoops()). And all that, in a loop / recursion, because the sub loops may themselves have children loops etc.

omarahmed marked 3 inline comments as done.Mar 7 2020, 5:28 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2422–2428

I tried to do the old method but with no false positives, from what I understand here we wanted only to detect if a cycle exists only and doesn't care if it is a loop or not so I thought that SCCs evenwhen they will give us the top level ones these top level ones will mean that there is a cycle and if there were not these top level SCC structure and it is only consists of single nodes SCCs it will make us not detect a cycle so i used hasCycle function to detect that , does a top level SCC can exist that is not a cycle and a smaller SCC that is a cycle exist in it ?

2438–2444

I found that the function is in mustExcute.cpp only and not available in the header file mustExcute.h so how could i use it in this case i guess it should be added to the header :)

2448–2449

I have tried it with a test like that

for(int i=0;i<n;i++)
    for(int j =0;j<n;j++)
        for(int k = 0;k<n;k++)
             for(int l = 0;l<n;l++)

and printed the loops that it iterate on and i found it had iterated even on the forth level one , also I have looked on the implementation of the cycle and found that it recursivly was going inside the loops so i guess it goes throw all of the loops not only the top-level ones or I miss something here ?

omarahmed marked an inline comment as done.Mar 7 2020, 5:52 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2448–2449

I mean the implementation of the function not cycle it is a typo :)

baziotis added inline comments.Mar 8 2020, 1:37 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2422–2428

Sorry, I wrote these comments quite quickly yesterday. You're right. To answer, at first, no, any SCC unless it is single-node without self-loop. Now, scc_iterator uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones.

Also, note that in this case, you could use the algorithm version with Processed. This checked whether there is a cycle in the graph in the same complexity as Tarjan, but it's also simpler. Use what you like. However, if you use the scc_iterator version, we need a good explanation (you can pick stuff from this comment).

2438–2444

Yep, true, it's static. Please add it to the header. :)

2448–2449

Yes, also true, it gives you all the loops in preorder, I just checked the function. So, yes that should work.

bbn added a subscriber: bbn.Mar 8 2020, 8:39 AM
omarahmed marked an inline comment as done.Mar 8 2020, 1:06 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2438–2444

i tried to erase static as static means that it will only used in the cpp file also i have put the definition in the header file but that resulted in an error in linking so how could we do that here ? I guess also that all the functions i saw was only static in the cpp file or they are belong to a class in the .h or cpp file but no function that is implemented alone so I think that getting this 3 lines is more clean or what do you think ?

omarahmed updated this revision to Diff 249021.Mar 8 2020, 4:29 PM

Patch updates :

Check If LoopInfo and SCEV analyses are available, then:

  • We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.
  • Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Else if unavailable:

  • We check if any cycle exists then we return true or else we return false.we use scc_iterator to check for cycles.

Note:

  • scc_iterator uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones.

why it is failing in harbor master remote builds,I am sure it compiles , passes tests and clang format run on it?
also regarding the tests it gives expected fails =1 in liveness.ll , from what i understand it is expected to fail anyway but is that a problem that needs to be fixed ?

Patch updates :

Check If LoopInfo and SCEV analyses are available, then:

  • We can test whether there is irreducible control (i.e. non-loop cycles). If there is, one of those cycles can be infinite, so we return true.
  • Otherwise, loop through all the loops in the function (including child loops) and test whether any of those does not have a maximal trip count (in which case it might be infinite).

Else if unavailable:

  • We check if any cycle exists then we return true or else we return false.we use scc_iterator to check for cycles.

Note:

  • scc_iterator uses Tarjan, which finds all the maximal SCCs. Those are SCCs which are as big as they can get. i.e. in which you can't put any other node and still have an SCC. To detect if there's a cycle, you only need to find the maximal ones.

That's quite better I think! :) Side note: You can rephrase or change my proposals regarding documentation, commit messages etc. I'm here only to try to make sure that you get the point across (you're also
free to copy exactly what I say of course, pick whatever you like).

why it is failing in harbor master remote builds,I am sure it compiles , passes tests and clang format run on it?

On that part unfortunately I can't help. I'm not familiar with harbomaster. AFAIK, it shouldn't fail. Maybe Johannes or @uenoku can help.

also regarding the tests it gives expected fails =1 in liveness.ll , from what i understand it is expected to fail anyway but is that a problem that needs to be fixed ?

AFAIK, no, this is not a problem. But make sure that you run the whole test suite. That is, probably now you run the tests in llvm/test/Transforms/Attributor/. After verifying that those pass,
you can run all the tests in llvm/test. Note that this should not make a difference since the Attributor is not currently used anywhere. But it's something that should be done in general.

llvm/lib/Transforms/IPO/Attributor.cpp
2438–2444

I think it's better if we use the already made function. I'll try to do that myself tomorrow and see what problems come up.

[Irrelevant] Some more info about static: It means that you can use the function only in this compilation unit. Usually, a compilation unit is a .cpp file, but if you include a .cpp file in another, those 2 will make one compilation unit.
As a side note, static has more to do with linking than with compiling. It basically means that the symbol (i.e. the function, which in assembly is just a label) is not exported. So, no code from another object file can jump to it.
Here's a great video in case you're interested in this kind of stuff: https://www.youtube.com/watch?v=n4fI4eUTTKM

baziotis added inline comments.Mar 8 2020, 7:32 PM
llvm/lib/Transforms/IPO/Attributor.cpp
2448

After a little bit of search, clang-tidy fails in the Harbomaster because you can declare L as auto *L (instead of auto L).
You can see the failed check by clicking on the Harbomaster failed build (https://reviews.llvm.org/B48509), then "pre-merge checks"(https://reviews.llvm.org/harbormaster/build/52266/), then "External Link clang-tidy linux" (https://results.llvm-merge-guard.org/amd64_debian_testing_clang8-4435/clang-tidy.txt).

omarahmed marked an inline comment as done.Mar 8 2020, 7:56 PM
omarahmed added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2448

oh sry that was obvious somehow and i didn't saw it,Thanks :)

baziotis added inline comments.Mar 9 2020, 6:01 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2438–2444

So, I had forgotten about namespace llvm. You need to put the declaration of the function inside namespace llvm in MustExecute.h.
Then, in MustExecute.cpp, you need to remove static but add llvm:: in the name, i.e. llvm::mayContainIrreducibleControl(....
In short, the function has to be declared inside the llvm namespace and when you define it, you have to specify that you want to define a function from this specific namespace.

uenoku added inline comments.Mar 9 2020, 9:09 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2424

nit: Normally, the word "strongly connected components" contains its maximality. So I think it's more concise to omit the description for maximality.

baziotis added inline comments.Mar 9 2020, 10:12 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2424

I'm not sure there's a definitive answer here. It depends on the definition. If we use the "reachability" definition, then it's not implied. An SCC can contain an SCC. But if we talk about plural, i.e. SCCs of a graph, the definition contains the term partition. Which implies maximality. I've seen both in books and I believe it's a long discussion in which we don't want to get the reader to. The fact that I didn't remember that scc_iterators finds maximal SCCs led to 3-4 days at least of working on a wrong solution (in this patch actually).

omarahmed updated this revision to Diff 249196.EditedMar 9 2020, 1:02 PM
omarahmed marked an inline comment as done.

Patch updates :

  1. Add mayContainIrreducibleControl() function to the MustExecute.h so it could be used in mayContainUnboundedCycle() function in Attributtor.cpp.
  2. Fix mayContainIrreducibleControl() function as the return value was reversed , it was :
    • Return true if Reducible control.
    • Return false if Irreducible control.
  3. Minor fixes in docs.

I don't know if reversing the condition of mayContainIrreducibleControl is right but it seemed logical to me also I have run llvm\test\analysis tests and it didn't break any test

omarahmed updated this revision to Diff 249263.EditedMar 9 2020, 6:51 PM

Patch updates :

  1. Add mayContainIrreducibleControl() function to the MustExecute.h so it could be used in mayContainUnboundedCycle() function in Attributtor.cpp.
  2. Fix mayContainIrreducibleControl() function as the return value was reversed , it was :
    • Return true if Reducible control.
    • Return false if Irreducible control.
  3. Minor fixes in docs.

Generally it LGTM. I've added some nit comments. Before accepting, I'd like @jdoerfert opinion on the change in mayContainIrreducibleControl().
I think it's a correct change.

llvm/lib/Transforms/IPO/Attributor.cpp
2417–2418

Remove the TODO.
Then please see the comment above.

2422

"If NoAnalysis is available ..." -> "If either SCEV or LoopInfo is not available ..."

2425

"you only need" -> "we only need"

2438

If there's irreducible control, the function may contain non-loop cycles.

2443

Any loop that does not have a max trip count is considered unbounded cycle.

Remove the TODO.
Then please see the comment above.

sry , I don't understand what u mean by the comment above , does u mean your comment to uenoko ?

uenoku added inline comments.Mar 10 2020, 8:52 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2424

Ok, I'm fine with either way :)

Remove the TODO.
Then please see the comment above.

sry , I don't understand what u mean by the comment above , does u mean your comment to uenoko ?

No sorry, I meant this:

Here, you want something clearer to what the function does. I'd say, you can go with:
"Helper function that checks whether a function has any cycle which we don't know is bounded.
Loops with maximum trip count are considered bounded, any other cycle not."
We may have to revise this further but it seems a good start.
llvm/lib/Transforms/IPO/Attributor.cpp
2424

Cool :)

omarahmed updated this revision to Diff 249425.Mar 10 2020, 9:37 AM

Minor cleanups

omarahmed updated this revision to Diff 249428.Mar 10 2020, 9:46 AM

Minor cleanups

Harbormaster completed remote builds in B48706: Diff 249425.
baziotis accepted this revision.Mar 10 2020, 3:48 PM

I added a single nit comment. Otherwise it LGTM! :)

llvm/lib/Transforms/IPO/Attributor.cpp
2418–2434

[Nit][Grammar] "we don't know is bounded" -> "we don't know if it is bounded"

omarahmed updated this revision to Diff 249520.Mar 10 2020, 4:42 PM

Minor cleanup

omarahmed added a comment.EditedMar 10 2020, 4:51 PM

I added a single nit comment. Otherwise it LGTM! :)

Thanks for the great reviews and bearing my small mistakes through this :)

I added a single nit comment. Otherwise it LGTM! :)

Thanks for the great reviews and bearing my small mistakes through this :)

No problem, I hope it helped. :) Let's wait for a couple of days for other reviewers and then I can commit.

Although I just saw we should also change the commit message (i.e. the description).
The following is a quick write that seems fine.

Add mayContainUnboundedCycle which checks if there's a cycle
in a function that we consider unbounded.
All loops with max trip count are considered bounded.
Any other cycle is not.

omarahmed updated this revision to Diff 249592.Mar 11 2020, 5:08 AM

Update commit message

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

omarahmed edited the summary of this revision. (Show Details)Mar 11 2020, 6:01 AM

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

done :) , Does we need more details than that ?

omarahmed edited the summary of this revision. (Show Details)Mar 11 2020, 6:15 AM

Update commit message

Sorry probably it wasn't clear. I meant that you should also update the description of this patch. :)

done :) , Does we need more details than that ?

I think it is ok. Although I haven't seen much the tests. Let's wait a couple of days for another reviewer and we can commit then.

uenoku accepted this revision.Mar 13 2020, 3:45 AM

LGTM

jdoerfert accepted this revision.Mar 13 2020, 7:53 AM

LGTM, I'll commit

This revision is now accepted and ready to land.Mar 13 2020, 7:53 AM
This revision was automatically updated to reflect the committed changes.

Thanks all :D
I am sorry but a message sent to me with this

The Buildbot has detected a new failure on builder fuchsia-x86_64-linux while building llvm.

So does that mean i broke something here ?

uenoku added a comment.EditedMar 13 2020, 10:29 AM

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

fhahn added a subscriber: fhahn.Mar 13 2020, 10:53 AM

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

I think a better way to put it is that you don't have to care about the failure if you did not cause it (e.g. you might not get multiple emails if other bots were broken before)

You don't have to care about the failure if the failure doesn't occur multiple times.
Please see https://lists.llvm.org/pipermail/llvm-dev/2019-November/136710.html

I think a better way to put it is that you don't have to care about the failure if you did not cause it (e.g. you might not get multiple emails if other bots were broken before)

Yes, thank you for the correction.

Got it, Thanks :)