This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Fix alternate cmp operands analysis.
AbandonedPublic

Authored by ABataev on Aug 25 2022, 12:22 PM.

Details

Reviewers
RKSimon
vdmitrie
Summary

Missed negation on the results for areCompatibleCmpOps for alternate cmp
operations, it may lead to non-optimal operands ordering.

Metric: SLP.NumVectorInstructions

 Program                                                                                       SLP.NumVectorInstructions
                                                                                               results                   results0 diff
                                test-suite :: External/SPEC/CFP2006/447.dealII/447.dealII.test  5109.00                   5107.00 -0.0%
                                test-suite :: External/SPEC/CFP2006/453.povray/453.povray.test  5514.00                   5509.00 -0.1%
                        test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test  5505.00                   5500.00 -0.1%
                      test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 27293.00                  27193.00 -0.4%
                         test-suite :: External/SPEC/CINT2017rate/541.leela_r/541.leela_r.test   252.00                    251.00 -0.4%
                        test-suite :: External/SPEC/CINT2017speed/641.leela_s/641.leela_s.test   252.00                    251.00 -0.4%
                     test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test   400.00                    396.00 -1.0%

447.dealII - extra <4x> vectorization
453.povray - same
511.povray_r - same, better shuffles
526.blender_r - > 20 new <4x> vectorization, less shuffles
541.leela_r, 641.leela_s, miniFE - pretty the same, better shuffles

Diff Detail

Event Timeline

ABataev created this revision.Aug 25 2022, 12:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 12:22 PM
ABataev requested review of this revision.Aug 25 2022, 12:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 25 2022, 12:22 PM
RKSimon added inline comments.Aug 30 2022, 8:48 AM
llvm/test/Transforms/SLPVectorizer/X86/reduction-logical.ll
231–237

The "<0,1,2,3,1,0,2,3>" swizzle in the upper half is a regression - I noticed it on a couple of other test changes as well.

ABataev updated this revision to Diff 457366.Sep 1 2022, 12:53 PM

Address comments

ABataev edited the summary of this revision. (Show Details)Sep 1 2022, 12:54 PM
ABataev edited the summary of this revision. (Show Details)Sep 1 2022, 12:55 PM
vdmitrie added inline comments.Sep 7 2022, 9:38 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3669

Is it possible to split off the reordering improvement into a separate patch?

5883

The main issue with existing code here which makes me nervous is that main/alternate logic is not consistent across SLP. getSameOpcode has some clear enough approach but its logic does not match one here [and another similar code].
Unfortunately this patch does not fix that. That is maintainability issue too.
For isAlternateInstruction, for example, there are couple of check-points: it shall answer true if I == AltOp and false when I == MainOp. But it is very difficult to figure from this code whether it will work this way.
Without clear description it is pretty difficult to reverse engineer the intent.
I put a draft patch here https://reviews.llvm.org/D133430 - as a possible approach to the problem.
It works very similar to this patch (not taking into account reordering improvement) but has consistent approach to main/alternate selection. There were couple of minor differences when an instruction could fit both main and alternate flows.

ABataev added inline comments.Sep 7 2022, 10:43 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3669

Sure

ABataev abandoned this revision.Sep 13 2022, 9:33 AM

Abandoned in favor of D133430