[SLP]Do not reduce repeated values, use scalar red ops instead.
Needs Review · Public

Authored by ABataev on Aug 19 2022, 1:58 PM.

Details

Reviewers
RKSimon
vdmitrie
Summary

Metric: size..text

size..text                                                                   results    results0   diff
SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-980605-1.test  445.00     461.00     3.6%
SingleSource/Benchmarks/Adobe-C++/loop_unroll.test                           428477.00  428445.00  -0.0%
External/SPEC/CFP2006/447.dealII/447.dealII.test                             618849.00  618785.00  -0.0%

For all tests some extra code was optimized; GCC-C-execute has some more
inlining afterwards.

Diff Detail

Unit Tests: Failed

Time        Test
60,060 ms   x64 debian > MLIR.Examples/standalone::test.toy
Script: -- : 'RUN: at line 1'; /usr/bin/cmake /var/lib/buildkite-agent/builds/llvm-project/mlir/examples/standalone -G "Ninja" -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang -DLLVM_ENABLE_LIBCXX=OFF -DMLIR_DIR=/var/lib/buildkite-agent/builds/llvm-project/build/lib/cmake/mlir -DLLVM_USE_LINKER=lld

Event Timeline

ABataev created this revision. Aug 19 2022, 1:58 PM
Herald added a project: Restricted Project. Aug 19 2022, 1:58 PM
ABataev requested review of this revision. Aug 19 2022, 1:58 PM
ABataev updated this revision to Diff 461984. Sep 21 2022, 12:33 PM

Rebase, ping

The patch seems to be trying to move the SLP vectorizer into InstCombine territory. I'm not sure why we should do that. Did you try to analyze which specific patterns helped in, for example, the GCC-C-execute test?
Could these represent cases where InstCombine could be improved instead? It is difficult to assess this patch based on the test changes, as a lot of SLP vectorizer tests were not designed to be run through InstCombine.
If you run "-instcombine -slp-vectorizer" instead of just -slp-vectorizer, how many of the affected LIT tests would still benefit from the patch?

Most of these tests will certainly be optimized, since they are pretty simple. But it looks like there are some other places where the reduction analysis in SLP is better than the similar analysis in InstCombine. Also, if we can do some of the optimization here, it should reduce compile time, since InstCombine consumes a lot of time.

Now the difference is even larger:

            test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test   431225.00   431288.00  0.0%
       test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test  2198241.00  2198465.00  0.0%
                      test-suite :: MultiSource/Applications/SPASS/SPASS.test   530608.00   530640.00  0.0%
        test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test  1135953.00  1135937.00 -0.0%
               test-suite :: External/SPEC/CFP2006/447.dealII/447.dealII.test   651440.00   651360.00 -0.0%
test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/SimpleMOC/SimpleMOC.test    48631.00    48311.00 -0.7%
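
For readers checking the numbers, the last column of these size..text tables appears to be the relative change of the second value against the first, rounded to one decimal place. The following is a minimal sketch under that assumption; the function names are hypothetical and are not part of the test-suite tooling.

```python
def relative_diff(first, second):
    """Relative change of `second` against `first`, in percent."""
    return (second - first) / first * 100.0


def format_diff(first, second):
    """Render the change the way the tables show it, e.g. '-0.7%'."""
    return f"{relative_diff(first, second):.1f}%"


if __name__ == "__main__":
    # Rows copied from the tables in this review.
    rows = [
        ("GCC-C-execute-980605-1", 445.00, 461.00),
        ("SimpleMOC", 48631.00, 48311.00),
        ("Adobe-C++/loop_unroll", 428477.00, 428445.00),
    ]
    for name, first, second in rows:
        print(f"{name:28s} {first:12.2f} {second:12.2f} {format_diff(first, second)}")
```

Changes of a few dozen bytes on a binary of several hundred kilobytes round to 0.0% or -0.0%, which is why most rows show no visible percentage even though the sizes differ.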

InstCombine can simplify most of what is currently not optimized, but that requires extra passes (like ReassociatePass) and extra compile time, while we can do it easily in the SLP vectorizer, since we already have all the required data. The PhaseOrdering tests are the proof.

A couple of general thoughts. Can you please add a knob that allows turning the optimization off? And can some sort of debug tracing be added, such as the values that have been optimized away?

ABataev updated this revision to Diff 489611. Mon, Jan 16, 11:25 AM

Address comments

ABataev updated this revision to Diff 490845. Fri, Jan 20, 7:25 AM

Rebase, ping.

I'll try to look at this closely next week (but not earlier than Monday). Just a quick note about the terminology used: the option name "slp-same-scalars-reduction" does not say much about what it actually controls.
Since you are basically trying to optimize away identity operations in reduction sequences, I'd suggest using a name, for the option and across the code, that better reflects that.
The option name could be "-slp-optimize-identity-hor-reduction-ops=true|false", for example.
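
To make the idea under discussion concrete, here is a small sketch of what "do not reduce repeated values, use scalar ops instead" could mean for integer reductions. It is an illustration only, not the patch's implementation: the helper names and the set of operations are assumptions, and the real SLP vectorizer works on LLVM IR values rather than Python integers.

```python
from collections import Counter
from functools import reduce
import operator

# Reduction operations considered in this sketch (an assumed set).
OPS = {
    "add": operator.add,
    "xor": operator.xor,
    "and": operator.and_,
    "or": operator.or_,
    "min": min,
    "max": max,
}


def fold_repeated(op_name, value, count):
    """Collapse `count` copies of `value` under `op_name` into one scalar."""
    if op_name == "add":
        return value * count                 # x + x + ... + x == x * n
    if op_name == "xor":
        return value if count % 2 else 0     # equal pairs cancel out
    if op_name in ("and", "or", "min", "max"):
        return value                         # idempotent: x op x == x
    raise ValueError(f"unsupported reduction: {op_name}")


def reduce_unique(op_name, values):
    """Reduce only the unique values, each pre-folded with its repeat count."""
    counts = Counter(values)
    folded = [fold_repeated(op_name, v, n) for v, n in counts.items()]
    return reduce(OPS[op_name], folded)


if __name__ == "__main__":
    values = [3, 5, 3, 3, 7, 5]
    for name, op in OPS.items():
        # The folded form must agree with the plain left-to-right reduction.
        assert reduce_unique(name, values) == reduce(op, values)
    print("folded and plain reductions agree")
```

In this sketch six operands shrink to three before the reduction runs, which is the kind of saving the review is weighing against leaving such simplifications to InstCombine.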

ABataev updated this revision to Diff 490963. Fri, Jan 20, 1:35 PM

Rebase, renamed option, added a test run for false option value

ABataev updated this revision to Diff 492126. Wed, Jan 25, 8:24 AM

Rebase, ping!