This is an archive of the discontinued LLVM Phabricator instance.

[DA] control compile-time spent by MIV tests
ClosedPublic

Authored by bmahjour on Jul 30 2021, 7:10 AM.

Details

Summary

Function exploreDirections() in DependenceAnalysis implements a recursive algorithm for refining direction vectors. This algorithm has worst-case complexity of O(3^(n+1)) where n is the number of common loop levels. It essentially performs a search over a complete ternary tree of dependence directions. The algorithm tries to prune the search space by checking testBounds before recursing for each direction, but since the delta usually falls within the sum of lower and upper bounds the search space ends up being the full tree most of the time.

The problem can be illustrated with a deeply nested loop. For exampley DA takes more than a minute to analyze a 12 level deep loop nest containing a single load and a store (see example below and the attached IR) on a 3.1 GHz P8 machine with 600 GB of RAM.

for (int i1 = 0; i1 < n; i1++)
  for (int i2 = 0; i2 < n; i2++)
    ...
       for (int i12 = 0; i12 < n; i12++)
           A[i1 + i2 + ... + i12] = B[i1 + i2 ... + i12];

The compile-time increases exponentially as the depth of the loop nest increases.

In this patch I'm adding a threshold to control the amount of time we spend in doing MIV tests (which most of the time end up resulting in over pessimistic direction vectors anyway).

Diff Detail

Event Timeline

bmahjour created this revision.Jul 30 2021, 7:10 AM
bmahjour requested review of this revision.Jul 30 2021, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2021, 7:10 AM

Is it feasible tp add a test-case that reaches the threshold? If a loop depth of 12 takes a minute, would a test with depth 10 still be acceptable to run as part of the regression tests?

Is it feasible tp add a test-case that reaches the threshold? If a loop depth of 12 takes a minute, would a test with depth 10 still be acceptable to run as part of the regression tests?

Not sure if it would be good to add a potentially long-running test (like a 10-level deep nest) to check-llvm. I tried this on a pretty powerful server...smaller systems or those under heavy load will likely take longer. However if you are interested in making sure that the DA results are still sensible when the threshold is reached, I could have a smaller, say 3-level loop nest and analyze it with da-miv-max-level-threshold=2. Would that be good?

Is it feasible tp add a test-case that reaches the threshold? If a loop depth of 12 takes a minute, would a test with depth 10 still be acceptable to run as part of the regression tests?

Not sure if it would be good to add a potentially long-running test (like a 10-level deep nest) to check-llvm. I tried this on a pretty powerful server...smaller systems or those under heavy load will likely take longer. However if you are interested in making sure that the DA results are still sensible when the threshold is reached, I could have a smaller, say 3-level loop nest and analyze it with da-miv-max-level-threshold=2. Would that be good?

Yes, that would be great! SCEV uses this approach as well.

bmahjour updated this revision to Diff 364222.Aug 4 2021, 12:45 PM

Added a test and lowered the default threshold to 7.

Meinersbur accepted this revision.Aug 4 2021, 1:44 PM

LGTM, thank you.

This revision is now accepted and ready to land.Aug 4 2021, 1:44 PM
This revision was automatically updated to reflect the committed changes.