The patch bound number of reassociations in LSR solution search.
Every time number of reassociations exceed 16^x we decrease lookup Depth by x.
Before the patch only lookup Depth was bounded.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Could you please explain what is the problem? I cannot make it out from the test, it's too big. Can it be reduced?
Problem is in the test compile time. You can download it and try.
The reason of the compile time hang is too many reasociations in LSR.
Now reassociations lookup is bounded only by Depth. If there are reasonable amount of reassociations on each level ~16,
 the whole number would not exceed ~16^3 which is ok.
However if number of reassociations itself is very big (~256 in this test) it could grow to ~256^3 which is unacceptable.
In the patch I'm trying to introduce some universal bound which will take in account both depth and number of reassociations and not hurt current performance s generally number of reassociations is <16.
Now reassociations lookup is bounded only by Depth. If there are reasonable amount of reassociations on each level ~16, the whole number would not exceed ~16^3 which is ok.
Did you intend to limit the number of reassociations to 16 at each level ?
| lib/Transforms/Scalar/LoopStrengthReduce.cpp | ||
|---|---|---|
| 3605–3606 ↗ | (On Diff #143852) | IIUC, this change perform like if there are many reassociations at one level, then we limit the depth in the next levels. right? If we want to guarantee certain number of reassociations at each level, we may need to limit the number of function call here at each level. I'm not sure which one is beneficial for performance in general between limiting the depth or limiting a certain number of reassociations at each level with a fixed depth. | 
| lib/Transforms/Scalar/LoopStrengthReduce.cpp | ||
|---|---|---|
| 3605–3606 ↗ | (On Diff #143852) | Yes. However the change do not limit linear case (1st level). If we want to limit 1st level, we'll need to decide which operand we use and which not, making algorithm more complicated with questionable compile time gain. Doing nothing in case we exceed some limit can hurt performance. I think that for linear cases we should leave number of reassociations as is. If you have example that shows we should limit this as well, please share. | 
| lib/Transforms/Scalar/LoopStrengthReduce.cpp | ||
|---|---|---|
| 3605–3606 ↗ | (On Diff #143852) | Thanks for the explanation. I agree. We may not want to simply limit considering AddOps in each level without knowing compile time gain. I don't have any specific example, but I also wanted to see any regression case caused by limiting the depth. If this function handle linear cases in general, it might be reasonable. | 
PING. The change does not change code in spec2000/spec2006. However, it prevents compile time hangs in some corner cases.
Thanks for clarification. LGTM as long as it will not introduce performance regressions. I think we can go with it and just rever it in case of some problems.