Download Raw Diff

Details

Reviewers

qcolombet
skatkov

Commits

rG67d93f0d9154: [LoopInfo] Limit the iterations to check whether a loop has dedicated exits for…
rL372990: [LoopInfo] Limit the iterations to check whether a loop has dedicated exits

Summary

We had a case that a single loop which has 4000 exits and the average number of predecessors of each exit is > 1000. It increased the compilation time of IVUser analysis significantly. This patch adds a limit for the iterations to check whether a loop has only dedicated exits. With the patch, the time to compile our testcase reduced from 1000s to 200s (clang release build).

Diff Detail

Repository: rL LLVM

Event Timeline

wmi created this revision.Sep 9 2019, 9:37 AM

Herald added a project: Restricted Project. · View Herald TranscriptSep 9 2019, 9:37 AM

I wonder whether the problem is in this function or in analysis itself?
If it is a problem of analysis it is better to restrict it but not an utility function.

In D67359#1664241, @skatkov wrote:

I wonder whether the problem is in this function or in analysis itself?
If it is a problem of analysis it is better to restrict it but not an utility function.

I feel the problem is in the utility function. Currently the problem is exposed in IVUser pass but the same can happen anywhere else if only the function is called for a huge loop with many exits and with each exit having many predecessors.

In D67359#1664242, @wmi wrote:

In D67359#1664241, @skatkov wrote:

I wonder whether the problem is in this function or in analysis itself?
If it is a problem of analysis it is better to restrict it but not an utility function.

I feel the problem is in the utility function. Currently the problem is exposed in IVUser pass but the same can happen anywhere else if only the function is called for a huge loop with many exits and with each exit having many predecessors.

Can you do some measurements to ensure that the issue is really in this utility function but not in the user of it?

At the same time it makes sense to update function description to specify that with your change the function can conservatively returns no if it could not get the precise answer.

Is there any chance you could provide the test case that shows the behavior? There might be something we could do in hasDedicatedExits to make it faster for large loops with a large number of exit blocks and successors.

In D67359#1664243, @skatkov wrote:

In D67359#1664242, @wmi wrote:

In D67359#1664241, @skatkov wrote:

I wonder whether the problem is in this function or in analysis itself?
If it is a problem of analysis it is better to restrict it but not an utility function.

I feel the problem is in the utility function. Currently the problem is exposed in IVUser pass but the same can happen anywhere else if only the function is called for a huge loop with many exits and with each exit having many predecessors.

Can you do some measurements to ensure that the issue is really in this utility function but not in the user of it?

I did that. For the huge loop, every call of hasDedicatedExits took about 3~5 seconds. hasDedicatedExits is called by isSimplifiedLoopNest which is used by IVUser pass. Indeed there is another problem that IVUser pass calls isSimplifiedLoopNest for the same loop multiple times (~200). However, IVUser is a loop analysis pass, and isSimplifiedLoopNest can be called for a loop different from the current loop analyzed. So isSimplifiedLoopNest is called for the same huge loop many times for many different invocations of IVUser pass analyzing different loops, and the cache mechanism local to IVUser doesn't work for it.

I understand it is possible to mitigate the problem in IVUser (I don't know how to do that yet. The original commit to add isSimplifiedLoopNest was r152892 in 2012 and the original bug was in rdar://problem/11049788 which I cannot read) but it seems also reasonable to put a limit in hasDedicatedExits -- For my case, the loop with 4000 exits + each exit has 1000 predecessors took 3~5 seconds for each call of hasDedicatedExits. It is already noticeable to compiler users. If the loop is larger, it will take more time even if hasDedicatedExits is called only once for it. I think it is better to put a limit for such high cost function and that is independent to improve the cache or algorithm in IVUser.

At the same time it makes sense to update function description to specify that with your change the function can conservatively returns no if it could not get the precise answer.

That makes sense. I will do that.

In D67359#1664487, @fhahn wrote:

Is there any chance you could provide the test case that shows the behavior? There might be something we could do in hasDedicatedExits to make it faster for large loops with a large number of exit blocks and successors.

The loop contains a huge switch case statement. I will see if I can create something showing similar behavior.

Sorry for taking a while to get back. Upload a testfile 1.cc:

1.cc497 KBDownload

~/workarea/llvm-r371746/rbuild/bin/clang -O2 -S 1.cc
Without the patch, it took 52s to compile.
With the patch, it took 2s to compile.
gcc took 9s to compile.

And for the testcase, disabling LSR didn't help much, so it is not an IVUser specific problem. It took so long mainly because of hasDedicatedExits.

Maybe you can add this test case to llvm test suite?

In D67359#1679956, @xbolva00 wrote:

Maybe you can add this test case to llvm test suite?

Do you know how to add a compile time testcase in testsuite usually?

Thanks,
Wei.

Probably just put it here:

https://github.com/llvm/llvm-test-suite/tree/master/SingleSource/Regression/C

In D67359#1679979, @xbolva00 wrote:

Probably just put it here:

https://github.com/llvm/llvm-test-suite/tree/master/SingleSource/Regression/C

Thanks, I can put the test there, but I am not sure how large should I increase the size of the test to so it will definitely timeout without the patch, and do I need to consider the test takes different time to compile on different machines? I remember there is a fixed compile time limit to build a file in testsuite but I don't remember how long it is. Do you happen to know that?

Address Serguei's comment: update function description.

In D67359#1679990, @wmi wrote:

In D67359#1679979, @xbolva00 wrote:

Probably just put it here:

https://github.com/llvm/llvm-test-suite/tree/master/SingleSource/Regression/C

Thanks, I can put the test there, but I am not sure how large should I increase the size of the test to so it will definitely timeout without the patch, and do I need to consider the test takes different time to compile on different machines? I remember there is a fixed compile time limit to build a file in testsuite but I don't remember how long it is. Do you happen to know that?

I dont know :( but 2s is probably okay. Maybe @fhahn can you help here..

skatkov added inline comments.Sep 23 2019, 8:14 PM

include/llvm/Analysis/LoopInfoImpl.h
88	I'm sorry if I was not specific but I meant the comment in LoopInfo.h file for the method hasDedicatedExits not in LoopInfoImpl.h.

Address Serguei's comment.

wmi marked an inline comment as done.Sep 23 2019, 10:37 PM

wmi added inline comments.

include/llvm/Analysis/LoopInfoImpl.h
88	Ah, got it, thanks. Fixed.

In D67359#1680004, @xbolva00 wrote:

In D67359#1679990, @wmi wrote:

In D67359#1679979, @xbolva00 wrote:

Probably just put it here:

https://github.com/llvm/llvm-test-suite/tree/master/SingleSource/Regression/C

Thanks, I can put the test there, but I am not sure how large should I increase the size of the test to so it will definitely timeout without the patch, and do I need to consider the test takes different time to compile on different machines? I remember there is a fixed compile time limit to build a file in testsuite but I don't remember how long it is. Do you happen to know that?

I dont know :( but 2s is probably okay. Maybe @fhahn can you help here..

I find the compile time limit in lnt nightly test is 500 seconds. To stably get time out without the patch, I estimate I need to increase the size of test so it takes > 30seconds to compile with the patch. I am not sure if that is ok.

The last thing. Would not you like to add a test examine the max-dedicate-exit-iterations functionality?

The test can use the small number for max-dedicate-exit-iterations and ensure that the function returns false? If it is difficult examine through opt you can probably write a unit test?
It seems it may protect you from future modification of this method.

If it is difficult or impossible to do -> lgtm.

In D67359#1681798, @skatkov wrote:

The last thing. Would not you like to add a test examine the max-dedicate-exit-iterations functionality?

Good point. Test added.

Address Serguei's comment: add a unittest.

Herald added a subscriber: asbirlea. · View Herald TranscriptSep 25 2019, 8:38 AM

Test added.

Great.

Thanks.

This revision is now accepted and ready to land.Sep 25 2019, 7:43 PM

Thanks all for the review!

Closed by commit rL372990: [LoopInfo] Limit the iterations to check whether a loop has dedicated exits (authored by wmi). · Explain WhySep 26 2019, 8:34 AM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[IVUser] Limit the iterations to check whether a loop has dedicated exits for extreme large case
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 221437

include/llvm/Analysis/LoopInfoImpl.h

lib/Analysis/LoopInfo.cpp

This is an archive of the discontinued LLVM Phabricator instance.

[IVUser] Limit the iterations to check whether a loop has dedicated exits for extreme large caseClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 221437

include/llvm/Analysis/LoopInfoImpl.h

lib/Analysis/LoopInfo.cpp

[IVUser] Limit the iterations to check whether a loop has dedicated exits for extreme large case
ClosedPublic