Page MenuHomePhabricator

Limit number of blocks scanned for non-local dependences

Authored by joerg on Jan 12 2016, 12:26 PM.



When MemoryDependenceAnalysis hits a CFG with many transparent blocks, the algorithm easily degrades into quadratic memory and time complexity. The easiest example is a long chain of BBs that don't otherwise use a location. The patch introduces a limit similar to the existing instructions-per-block limit, counting the total number of blocks checked. If the limit it reached, entries are considered unknown.

I'm not entirely sure how to best test this. The test cases for the original bug are huge, when they should trigger on a moderately large machine.

Diff Detail


Event Timeline

joerg updated this revision to Diff 44660.Jan 12 2016, 12:26 PM
joerg retitled this revision from to Limit number of blocks scanned for non-local dependences.
joerg updated this object.
joerg added a reviewer: reames.
joerg added a subscriber: llvm-commits.
bruno added a subscriber: bruno.Jan 20 2016, 10:33 AM

Hi Joerg,

Thanks for working on this!

Given your testcase (original bug), could you show how much time it takes for different BlockNumberLimit values? Although a 100 seems reasonable at first look, I wonder how much we can push the limit for the default value.

joerg added a comment.Jan 24 2016, 2:40 PM

Copying comment that didn't make into phab:

Run time in steps of 100, absolute value not necessarily stable:
2.5s 3.0s 3.3s 4.1s 4.1s 4.3s 4.8s 5.1s 5.4s 5.6s

It's not exactly surprising that the initial ramp up is mostly linear. I
simply don't know what the implications for runtime are at this point,
can't easily run LNT.

reames edited edge metadata.Jan 25 2016, 1:44 PM

I'm really not sure this patch is going in the right direction. In generally, we try to avoid arbitrary thresholds. Is there an algorithmic fix which can address the compile time issue?

Can you give a bit more information on the IR characteristics which are causing this?

joerg added a comment.Feb 11 2016, 3:29 PM

Ping? As discussed in the associated PR, the algorithmic issues are pretty much unavoidable due to the nature of the analysis. As this fixes a real issue for the 3.8 release, I'd still like to see this go in. For trunk, the work on MemSSA likely makes this whole code obsolete with NewGVN.

This revision was automatically updated to reflect the committed changes.