This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Limit max size of AddRecExpr during evolving
Closed, Public

Authored by mkazantsev on Jul 20 2017, 1:02 AM.

Details

Summary

When SCEV calculates the product of two SCEVAddRecExprs from the same loop, it
tries to combine them into one big AddRecExpr. If the sizes of the initial
SCEVs are S1 and S2, the size of their product is S1 + S2 - 1, and every
operand of the resulting SCEV is combined from operands of the initial SCEVs,
so it is considerably more complex than any of them.
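
For instance, multiplying two affine AddRecs from the same loop follows the standard expansion below (a, b, c, d here are illustrative loop-invariant operands):

{a,+,b} * {c,+,d} = {a*c, +, a*d + b*c + b*d, +, 2*b*d}

Two size-2 inputs yield a size-3 result (2 + 2 - 1 = 3), and each operand of the result already mixes several operands of the inputs.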

As a result, if we try to compute something like:

%x1 = {a,+,b}
%x2 = mul i32 %x1, %x1
%x3 = mul i32 %x2, %x1
%x4 = mul i32 %x3, %x2
...

The size of such SCEVs grows exponentially in the length of the
chain, and the operands become more and more complex with every step.
This leads to long compilation times and huge memory consumption.
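
Applying the S1 + S2 - 1 rule to the chain above makes the growth concrete:

size(%x1) = 2
size(%x2) = 2 + 2 - 1 = 3
size(%x3) = 3 + 2 - 1 = 4
size(%x4) = 4 + 3 - 1 = 6
...

Each step adds the sizes of its two factors, so the operand counts follow a Fibonacci-like progression, while the operands themselves blow up even faster.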

This patch sets a limit beyond which we do not try to combine two
SCEVAddRecExprs into one. By default, the maximum allowed size of the
resulting AddRecExpr is 16.
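
A minimal sketch of such a cutoff (the option name and exact placement are assumptions based on the description above, not necessarily the committed code):

// In lib/Analysis/ScalarEvolution.cpp; needs llvm/Support/CommandLine.h.
static cl::opt<unsigned> MaxAddRecSize(
    "scalar-evolution-max-add-rec-size", cl::Hidden, cl::init(16),
    cl::desc("Max size of AddRec produced when folding a product"));

// Inside the AddRec * AddRec folding path of getMulExpr(): if the
// combined AddRec would exceed the limit, skip the fold and leave the
// product as a plain SCEVMulExpr.
if (AddRec->getNumOperands() + OtherAddRec->getNumOperands() - 1 >
    MaxAddRecSize)
  continue;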

Diff Detail

Repository
rL LLVM

Event Timeline

mkazantsev created this revision. Jul 20 2017, 1:02 AM
sanjoy accepted this revision. Jul 21 2017, 10:40 AM

Hi,

This change itself looks fine to me. However, I'm getting a bit worried about the arbitrariness of the thresholds we have now. Can you figure out some way to gather histograms of these quantities on a run (and perhaps run it over some benchmarks you have)? That is, what values of AddRec->getNumOperands() + OtherAddRec->getNumOperands() - 1 do we actually see in practice? We could record and report this behind a flag.
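
One way to gather such data (a sketch only; the flag name below is hypothetical, not an existing option):

// Needs llvm/Support/CommandLine.h and llvm/Support/Debug.h.
static cl::opt<bool> ReportAddRecSizes(
    "scev-report-addrec-sizes", cl::Hidden, cl::init(false),
    cl::desc("Print the combined size of every attempted AddRec fold"));

// At the point where the two AddRecs would be combined:
unsigned FoldedSize =
    AddRec->getNumOperands() + OtherAddRec->getNumOperands() - 1;
if (ReportAddRecSizes)
  dbgs() << "AddRec fold size: " << FoldedSize << "\n";

The histogram itself can then be built offline, e.g. with sort | uniq -c over the collected output.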

This revision is now accepted and ready to land. Jul 21 2017, 10:40 AM

Hi Sanjoy,

Investigating the impact of thresholds may be an interesting idea indeed, but it is not clear where we should measure them. This patch came out of evaluating bugs 33753 and 33494 (it actually fixes the first one and some subset of the tests of the latter). The problem is that the impact of these parameters can differ dramatically between real workloads and fuzzed tests of special forms. For example, some of the tests from bug 33494 pass with only the AddRec limit set, while others also need the arithmetic-depth limit reduced to pass in reasonable time. The values of those limits were chosen quite arbitrarily, though, and it is not clear what we should use as a reference when choosing them.

I will run these experiments with the tests I have, but I can imagine that we won't see a significant difference on real code.


This is what I mean -- let's say that in "real" workloads (for some fuzzy definition of "real"), AddRec->getNumOperands() + OtherAddRec->getNumOperands() - 1 is 18 roughly 30% of the time, and is never more than 20. In such a universe we've picked too low a value for MaxAddRecSize -- we should have set it to 20. In other words, I'm interested in the code-quality impact of these thresholds on real-world code. Since "real world code" means different things to different folks, having a somewhat generic method of deriving these limits would be useful.


I've filed https://bugs.llvm.org/show_bug.cgi?id=33901 to keep track of this task. I think some other limits, such as the arithmetic depth limit, may also need re-adjusting.

This revision was automatically updated to reflect the committed changes.