Page MenuHomePhabricator

[LoopUnswitch] Perform loop unswitching on select instructions
Needs RevisionPublic

Authored by kachkov98 on Dec 1 2022, 6:29 AM.



Try to convert select instruction with loop-invariant condition into
control-flow equivalent (if-then branch and phi node) and unswitch
this branch. This unswitching is always non-trivial since the successors
of created branch are not loop exits, so it requires cloning of loop.

Diff Detail

Unit TestsFailed

230 msx64 debian > LLVM.Transforms/SimpleLoopUnswitch::nontrivial-unswitch.ll
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/opt -passes='loop(simple-loop-unswitch<nontrivial>),verify<loops>' -S < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
6,100 msx64 debian > libFuzzer.libFuzzer::fuzzer-leak.test
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/lib/fuzzer -m64 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/fuzzer/LeakTest.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/fuzzer/X86_64DefaultLinuxConfig/Output/fuzzer-leak.test.tmp-LeakTest

Event Timeline

kachkov98 created this revision.Dec 1 2022, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2022, 6:29 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
kachkov98 requested review of this revision.Dec 1 2022, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 1 2022, 6:29 AM
kachkov98 added a comment.EditedDec 1 2022, 6:30 AM

test-suite statistics:

Program                                                                             simple-loop-unswitch.NumSelects
           test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test  52.00                         
                 test-suite :: External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s.test  16.00                         
                  test-suite :: External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r.test  16.00                         
                        test-suite :: MultiSource/Applications/ClamAV/clamscan.test   7.00                         
                      test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test   6.00                         
                       test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test   6.00                         
test-suite :: MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset.test   4.00                         
                   test-suite :: External/SPEC/CINT2017speed/657.xz_s/657.xz_s.test   3.00                         
                          test-suite :: MultiSource/Applications/oggenc/oggenc.test   3.00                         
      test-suite :: External/SPEC/CINT2017rate/500.perlbench_r/500.perlbench_r.test   3.00                         
     test-suite :: External/SPEC/CINT2017speed/600.perlbench_s/600.perlbench_s.test   3.00                         
                    test-suite :: External/SPEC/CINT2017rate/557.xz_r/557.xz_r.test   3.00                         
             test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test   3.00                         
                       test-suite :: MultiSource/Applications/JM/lencod/lencod.test   2.00                         
            test-suite :: MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg.test   2.00                         
      test-suite :: MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg.test   2.00                         
      test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test   2.00                         
     test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test   2.00                         
                            test-suite :: MultiSource/Benchmarks/Bullet/bullet.test   2.00                         
                            test-suite :: MultiSource/Applications/spiff/spiff.test   2.00                         
                            test-suite :: MultiSource/Benchmarks/nbench/nbench.test   1.00                         
        test-suite :: MultiSource/Benchmarks/mediabench/g721/g721encode/encode.test   1.00                         
          test-suite :: MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2.test   1.00                         
                        test-suite :: MultiSource/Applications/sqlite3/sqlite3.test   1.00                         
test-suite :: MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan.test   1.00                         
          test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test   1.00                         
                  test-suite :: MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk.test   1.00                         
      test-suite :: MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame.test   1.00                         
                     test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test   1.00                         
      test-suite :: MultiSource/Benchmarks/MiBench/office-ispell/office-ispell.test   1.00                         
           test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test   1.00                         
             test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test   1.00                         
                                  test-suite :: MultiSource/Benchmarks/sim/sim.test   1.00

I don't observe any measurable execution time change on SPEC2017 with this change, but in some cases unswitching of select looks desirable (like

mkazantsev added inline comments.Dec 4 2022, 11:22 PM

It is not an equivalent transform in sense of UB. If the condition was posion, then select by this condition is also poison. If you turn it into a branch, branch by poisoned condition is immediate UB. How do you account for that?

kachkov98 updated this revision to Diff 480082.Dec 5 2022, 5:54 AM

Insert freeze of select condition

kachkov98 added inline comments.Dec 5 2022, 5:59 AM

Accidentionally it was correct in some tests, because unswitchNontrivialInvariants() inserts freeze for branch conditions if the branch is not guaranteed to execute at least once in loop body; for other loops it will not work, so now we insert freeze of select condition explicitly.

mkazantsev requested changes to this revision.Dec 6 2022, 11:47 PM
mkazantsev added inline comments.

Can we change the insertion point to right after the cond? Presumeably it's outside of loop.


This is wrong, you can't insert a Phi before SI if there is another instruction before it. Your insertion point should be smth like block.begin().


This condition is growing weird, can we rewrite it to make it explicit that we choose between guard, select, terminator etc? Maybe factor out into a lambda.


Just for record: this will cost us compile time, turning the algorithm from O(number of blocks) to O(number of instructions) (if guards are off). Keep that in mind if you see compile time regressions.


I would also filter out such obviously useless cases as:

  • Select is not used
  • Select is only used outside of the loop (and therefore can be sunk)

maybe smth else, give it some more thought.


To see what your tests are doing, I propose the following steps:

  • Regenerate all checks in this file using script;
  • Check in your new tests with checks as is (without your change in code)
  • Then rebase your patch on top of it, to see what exactly your patch has changed.

Without this, it's hard to tell what's right or wrong here.

This revision now requires changes to proceed.Dec 6 2022, 11:47 PM
mkazantsev added inline comments.Dec 6 2022, 11:48 PM

I also propose to give this Phi a name.

fhahn added a subscriber: caojoshua.Dec 7 2022, 7:10 AM

It looks like this tries to address the same issue as D138526. Might be goo to sync up with @caojoshua

It looks like this tries to address the same issue as D138526. Might be goo to sync up with @caojoshua

Can confirm this is addressing the same issue. Our implementations are quite a bit different. It seems you are converting select's into branch's and mostly using existing logic. In my review, I try to directly unswitch the select. Not sure which approach is preferred. I'll let people who have been working on LLVM longer give their thoughts.

Either way, I think we can share some test cases. In my review, I pulled in select tests from the old PM LoopUnswitch pass, and added some checks to existing tests.