Change store merge candidates check cut off to 1024.
The previous value of 8192 resulted in 5x compile time hits in some pathological cases.
Differential D46581
[DAGCombine] Change store merge candidates check cut off to 1024 aemerson on May 8 2018, 7:35 AM. Authored by
Details Change store merge candidates check cut off to 1024. The previous value of 8192 resulted in 5x compile time hits in some pathological cases.
Diff Detail
Event TimelineComment Actions I ran the test suite and SPEC2000/2006 benchmarks and didn't see any significant change with this. Comment Actions I think pruning this back is reasonable. The choice of 8192 was arbitrary. Have you checked larger sizes than 1024? Given it's a 5x increase (presumably) of the total runtime and the algorithm is O(N) I would expect to only need a 5x reduction and 2048 would be roughly equivalent. Also, since you've run the spec benchmarks, can do a quick check to see which binaries changed? This search could be rewritten to bound this search to the true common ancestor but there's no point in rewriting it unless there's a valid merging case where we give up early. Either way this LGTM. Comment Actions The actual problem here seems to be superlinear performance with this cutoff value. The 5x I mentioned was inappropriate as that was over a whole compile and compared to an old version of LLVM, before the store merging after legalisation patch for AArch64 was landed in December. The actual problem in my test case, which is admittedly a very large one, is that with the original value of 8192, I get around 330s runtime for my test case, at Max=2048 its 189s, Max=1024 its 34s, Max=512 21s, Max=256 20s. So somewhere between 512 and 1024 we start to see this superlinear compile time. Unfortunately I can't share the test case, but something is definitely not O(N). Comment Actions
Fair enough. If you happen into a sharable test case that I could use as a benchmark I'd be happy to see if I could prune this check more; It seems like there's still a lot of time being consumed here. |