This is an archive of the discontinued LLVM Phabricator instance.

[InstCombiner] Add option to replace PHI of GEPs with GEP with PHI as index
Needs ReviewPublic

Authored by eklepilkina on Jun 14 2022, 1:29 AM.

Details

Summary

Creating a PHI node whose incoming values are GEP instructions into the same object prevents those GEP instructions from being sunk later, in the CodeGenPrepare phase.
This prevents offset addressing from being generated on targets such as RISC-V.

This patch modifies the IR so that the GEPs can be sunk and base+offset addressing can be generated for arrays.
Suppose we get the following IR:

array_id1 = gep array, <const>
array_id2 = gep array, <const1>
... 
phi_node = phi(array, array_id1, array_id2, ...)
store <value>, phi_node

Clang does not detect that all incoming values of phi_node share the same base; code generation does not need an extra register here and should reuse base+offset.

This can speed up some benchmarks, so the behaviour is added under a flag.
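As a rough source-level illustration of the two IR shapes discussed in the summary, the sketch below writes the same store in two ways: once by selecting among addresses derived from a common base (the "PHI of GEPs" shape), and once by selecting an index and computing the address a single time (the "GEP with PHI as index" shape). The function names are hypothetical and are not taken from the patch; whether a compiler actually produces either IR shape from this code depends on the optimization pipeline.

```cpp
#include <cstddef>

// Hypothetical example, not code from the patch.
// Shape A: each branch computes its own address from the same base,
// and the store goes through the merged pointer ("PHI of GEPs").
int storeViaPointerPhi(int *array, int which, int value) {
  int *p;
  switch (which) {
  case 0:  p = array;     break; // the array itself, no offset
  case 1:  p = array + 1; break; // base + constant offset
  default: p = array + 2; break; // base + another constant offset
  }
  *p = value;
  return static_cast<int>(p - array); // index that was written
}

// Shape B: each branch only picks an index; the address is computed
// once from the common base ("GEP with PHI as index").
int storeViaIndexPhi(int *array, int which, int value) {
  std::ptrdiff_t idx;
  switch (which) {
  case 0:  idx = 0; break;
  case 1:  idx = 1; break;
  default: idx = 2; break;
  }
  array[idx] = value;
  return static_cast<int>(idx); // index that was written
}
```

Both functions write the same element for the same inputs; the difference is only in where the address arithmetic happens, which is what the transform in this revision is about.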

Diff Detail

Event Timeline

eklepilkina created this revision.Jun 14 2022, 1:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 1:29 AM
eklepilkina requested review of this revision.Jun 14 2022, 1:29 AM
eklepilkina retitled this revision from [test][SimplifyCFG] Precommit test with GEP instructions to [SimplifyCFG] Don't sink common code if PHI node with some GEPs is created.Jun 14 2022, 1:31 AM
eklepilkina edited the summary of this revision. (Show Details)
eklepilkina added a reviewer: anton-afanasyev.

Please rebase against the precommitted tests.

Rebased against the precommitted tests.

KYG added a subscriber: KYG.Jun 22 2022, 2:32 AM
  • [SimplifyCFG] Added flag to control creation of PHI nodes with entity elements
eklepilkina edited the summary of this revision. (Show Details)Jul 3 2022, 9:59 PM
nikic requested changes to this revision.Jul 4 2022, 2:26 AM
nikic added a subscriber: nikic.

I don't think this hack is acceptable. The actual problem here seems to be that the phi (gep(p, c1), gep(p, c2)) doesn't get converted into gep p, phi(c1, c2) afterwards, which is what would give you the best of both worlds: The store is sunk, but we still make use of the common base.

The reason why this doesn't happen is once again our dumb structural GEP representation. In this case, the difference is in a struct index, and struct indices cannot be variable, so we can't have a gep of phi over a struct index. If this weren't a struct GEP, then everything would fold as expected.

Probably the easiest way to fix this is to canonicalize constant GEPs to byte GEPs in InstCombine.
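The point about struct indices can be illustrated at the source level. A field of a struct can only be named with a compile-time constant, but its byte offset from the start of the struct is an ordinary integer, so a choice between two fields can be expressed as a choice between two byte offsets applied to one base pointer. The sketch below is only an analogy in C++ and assumes a hypothetical `Pair` struct and `storeField` helper; it is not how InstCombine implements anything.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical struct used only for illustration.
struct Pair {
  int first;
  int second;
};

// Selects a byte offset at run time and applies it to a single base
// pointer, which is the source-level analogue of a byte GEP whose
// offset operand is a PHI of constants.
void storeField(Pair *p, bool pickSecond, int value) {
  std::size_t offset =
      pickSecond ? offsetof(Pair, second) : offsetof(Pair, first);
  char *addr = reinterpret_cast<char *>(p) + offset;
  // memcpy writes the bytes of the selected int field.
  std::memcpy(addr, &value, sizeof value);
}
```

With a field-index form, the two addresses have to be computed separately and merged as pointers; with the byte-offset form, only the offset varies and the base is visibly shared.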

This revision now requires changes to proceed.Jul 4 2022, 2:26 AM

[InstCombiner] Add option to replace PHI of GEPs with GEP with PHI as index

eklepilkina retitled this revision from [SimplifyCFG] Don't sink common code if PHI node with some GEPs is created to [InstCombiner] Add option to replace PHI of GEPs with GEP with PHI as index.Jul 14 2022, 5:53 AM
eklepilkina edited the summary of this revision. (Show Details)

@nikic, thank you! I've rewritten this modification in InstCombine.

eklepilkina added a comment.EditedJul 14 2022, 6:07 AM

I added this under a flag and ran a subset of the test-suite with the flag turned on.
The comparison results are in cycles (measured with perf).

Program                                       lhs             rhs             diff 
test-suite...ks/McCat/12-IOtest/iotest.test     1730119179.50   1759987511.50  1.7%
test-suite...dbl/InductionVariable-dbl.test    15970628662.00  16143688715.00  1.1%
test-suite...ks/Shootout/Shootout-ary3.test     6325347117.00   6381912099.50  0.9%
test-suite...nchmarks/McCat/05-eks/eks.test        8518913.50      8587491.50  0.8%
test-suite...ootout/Shootout-ackermann.test       25282176.00     25469480.50  0.7%
test-suite...arks/Misc-C++/mandel-text.test        4674518.00      4703841.50  0.6%
test-suite...hmarks/Linpack/linpack-pc.test    11993126592.50  12064251950.50  0.6%
test-suite...hmarks/McCat/08-main/main.test      225665552.50    226796119.00  0.5%
test-suite...nchmarks/llubenchmark/llu.test    49595221348.00  49825997987.50  0.5%
test-suite...ks/BenchmarkGame/fannkuch.test     8840456858.50   8875706667.00  0.4%
test-suite...enchmarks/Misc-C++/bigfib.test     1115784166.50   1119810490.00  0.4%
test-suite...s/Shootout/Shootout-lists.test    15792799815.50  15841314384.00  0.3%
test-suite...hootout/Shootout-methcall.test    23138087710.50  23207934639.00  0.3%
test-suite...enchmarks/SmallPT/smallpt.test   133647137171.50 134007183990.00  0.3%
test-suite...pansion-dbl/Expansion-dbl.test    15082523126.50  15122058550.00  0.3%
test-suite...ks/VersaBench/8b10b/8b10b.test    16297287940.00  16334546875.50  0.2%
test-suite...g/correlation/correlation.test    22344211730.00  22376027642.50  0.1%
test-suite...dbl/LoopRestructuring-dbl.test    17866275862.50  17891513886.50  0.1%
test-suite...tions/lambda-0.1.3/lambda.test    23848609431.50  23874580616.00  0.1%
test-suite...isc-C++/Large/sphereflake.test    22534218048.50  22557448226.50  0.1%
test-suite...arks/CoyoteBench/fftbench.test     9755232988.50   9764752974.50  0.1%
test-suite...BenchmarkGame/Large/fasta.test     2953324792.00   2956162851.00  0.1%
test-suite...hootout/Shootout-heapsort.test    12178551710.50  12189772510.50  0.1%
test-suite.../VersaBench/ecbdes/ecbdes.test     7658433342.00   7665010262.00  0.1%
test-suite...lications/minisat/minisat.test    36458868838.50  36487032574.50  0.1%
test-suite...nch/beamformer/beamformer.test     3386405080.50   3388803280.00  0.1%
test-suite...s/BenchmarkGame/recursive.test     4357781157.50   4360511842.50  0.1%
test-suite...sc-C++/stepanov_container.test    19735457132.50  19744002826.00  0.0%
test-suite...ing/covariance/covariance.test    22339342859.00  22345554589.50  0.0%
test-suite...nchmarks/McCat/18-imp/imp.test      220190325.50    220233858.50  0.0%
test-suite...nchmarkGame/spectral-norm.test     5131851853.00   5132763450.50  0.0%
test-suite...ications/JM/lencod/lencod.test    22163302318.00  22164857640.00  0.0%
test-suite.../Shootout/Shootout-random.test     6800321466.00   6800648301.50  0.0%
test-suite...C/Packing-flt/Packing-flt.test    11853900866.00  11854075069.50  0.0%
test-suite...arks/Misc-C++/oopack_v1p8.test      469347892.00    469345852.00 -0.0%
test-suite.../Shootout/Shootout-matrix.test     7736484059.50   7736206171.50 -0.0%
test-suite...arks/BenchmarkGame/n-body.test     5077979989.50   5077524708.00 -0.0%
test-suite...ks/Shootout/Shootout-fib2.test     6044285917.00   6042939836.50 -0.0%
test-suite...rks/CoyoteBench/almabench.test    55903132772.00  55888456245.00 -0.0%
test-suite...arks/VersaBench/dbms/dbms.test     6377529281.00   6375371712.00 -0.0%
test-suite...lFlow-flt/ControlFlow-flt.test    14248096032.50  14243188640.00 -0.0%
test-suite...BenchmarkGame/partialsums.test     1751834064.50   1751013687.00 -0.0%
test-suite...ks/Shootout/Shootout-hash.test    26538589089.00  26525947808.00 -0.0%
test-suite...ks/McCat/04-bisect/bisect.test      461377892.00    461131908.00 -0.1%
test-suite...rks/CoyoteBench/huffbench.test    51898090546.50  51863510434.00 -0.1%
test-suite...ks/Misc-C++/stepanov_v1p2.test    19604519098.50  19590038581.00 -0.1%
test-suite...s/Shootout/Shootout-sieve.test    17504360694.50  17489338906.00 -0.1%
test-suite...gebra/kernels/syr2k/syr2k.test    19044493844.00  19023576343.00 -0.1%
test-suite...marks/CoyoteBench/lpbench.test    31592682170.00  31555540368.00 -0.1%
test-suite...lications/SIBsim4/SIBsim4.test    12830348755.00  12814944423.50 -0.1%
test-suite...hmarks/VersaBench/bmm/bmm.test    15498981844.00  15471594390.00 -0.2%
test-suite...ications/JM/ldecod/ldecod.test      200649404.50    200267732.50 -0.2%
test-suite...hmarks/Misc-C++/Large/ray.test    11946912378.00  11921434993.50 -0.2%
test-suite...ing-flt/LoopRerolling-flt.test    16000457270.00  15951286477.00 -0.3%
test-suite...hmarks/Misc-C++-EH/spirit.test    51411075104.00  51251308313.00 -0.3%
test-suite...arks/mafft/pairlocalalign.test    70582165091.00  70319704425.50 -0.4%
test-suite...ks/McCat/01-qbsort/qbsort.test      329657452.50    328273253.00 -0.4%
test-suite.../Shootout/Shootout-strcat.test      440747280.00    438762150.00 -0.5%
test-suite...s/Shootout/Shootout-hello.test        1553176.00      1545444.50 -0.5%
test-suite...BenchmarkGame/nsieve-bits.test     3450081201.50   3431868282.50 -0.5%
test-suite...arks/BenchmarkGame/puzzle.test      401683862.50    399282456.00 -0.6%
test-suite...nchmarks/McCat/09-vor/vor.test      391180151.50    388582881.50 -0.7%
test-suite...algebra/kernels/symm/symm.test   147266145123.00 146250484895.00 -0.7%
test-suite.../Applications/spiff/spiff.test    10344289458.00  10271890620.00 -0.7%
test-suite...cCat/03-testtrie/testtrie.test       40115596.50     39727934.00 -1.0%
test-suite...lications/sqlite3/sqlite3.test    14597138412.50  14408790445.50 -1.3%
test-suite...hmarks/McCat/15-trie/trie.test        4096874.00      4027291.50 -1.7%
test-suite...arks/McCat/17-bintr/bintr.test      304726204.50    297527095.00 -2.4%
test-suite...Shootout/Shootout-objinst.test        1592863.50      1541648.50 -3.2%
test-suite...otout/Shootout-nestedloop.test        1656967.50      1538899.00 -7.1%
Geomean difference                                                            -0.2%
                lhs           rhs       diff
count  7.000000e+01  7.000000e+01  70.000000
mean   1.673824e+10  1.672873e+10 -0.001837 
std    2.611553e+10  2.605807e+10  0.010844 
min    1.553176e+06  1.538899e+06 -0.071256 
25%    6.309570e+08  6.319620e+08 -0.002075 
50%    1.004976e+10  1.001832e+10 -0.000020 
75%    1.946451e+10  1.944842e+10  0.001332 
max    1.472661e+11  1.462505e+11  0.017264
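
For readers checking the numbers, the diff column above is consistent with a relative change of rhs with respect to lhs, i.e. rhs / lhs - 1. The sketch below computes that value and a geometric-mean difference over a set of rows. It assumes the geomean is taken over the per-row rhs / lhs ratios, which is a common convention but is not confirmed by this thread; the `Sample` type and function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// One row of the comparison table (hypothetical type).
struct Sample {
  double lhs;
  double rhs;
};

// Relative change of rhs with respect to lhs, e.g. 0.017 for +1.7%.
double relativeDiff(const Sample &s) { return s.rhs / s.lhs - 1.0; }

// Geometric mean of the rhs/lhs ratios, reported as a relative change.
// Assumption: this mirrors how "Geomean difference" is derived.
double geomeanDiff(const std::vector<Sample> &samples) {
  if (samples.empty())
    return 0.0;
  double logSum = 0.0;
  for (const Sample &s : samples)
    logSum += std::log(s.rhs / s.lhs);
  return std::exp(logSum / static_cast<double>(samples.size())) - 1.0;
}
```

Applied to the iotest.test row (1730119179.50 and 1759987511.50), `relativeDiff` gives about 0.0173, matching the 1.7% shown in the table and the max in the summary statistics.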

Runs were made on an Alibaba T-Head RVB-ICE development board featuring a RISC-V dual-core 1.2GHz XuanTie C910 ICE SoC with a Vivante 3D GPU, an NPU, and 4GB RAM.

I didn't get any real regressions. The first two entries (iotest.test and InductionVariable-dbl.test) aren't real regressions: the IR isn't changed by InstCombine and the assembly is the same.

At the same time, we got an improvement on Coremark with this patch.

nikic added a comment.Jul 14 2022, 6:08 AM

We already have an implementation of this general transform -- shouldn't it be sufficient to relax only this condition? https://github.com/llvm/llvm-project/blob/7dc18a62e40e241019ec77e70f01bc41d39ab748/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp#L537 (At least for the basic case.)

Yes, I saw this code and made several experiments with it, but decided to add a separate method. Let me describe my motivation.

  1. The original fold methods check the first PHI operand and handle only the cases where every PHI operand is a GEP. We need the case where the PHI contains both GEP instructions and the array itself.
  2. My experiments show that under many conditions this modification makes things worse (as mentioned in the comment on the condition you linked). If you look at the code, we have added several conditions related to the uses of this PHI node and its number of operands. Since I had to add all these not very obvious heuristics, I decided to implement the modification as a separate method. It is also easier to turn on and off with a flag. Integrating the flag and all the needed heuristics into the existing method would make it less readable.

I haven't looked at the patch in detail, but the comments suggest this is not moving in the right direction. There should not be a debug flag to enable this transform unless there are planned follow-ups that would allow removing the flag soon after the initial commit. If the transform needs heuristics to be profitable, then it's probably not a good match for instcombine (canonicalization). The backend or later passes would have to be able to invert the transform to avoid regressions.

nikic added a comment.Jul 14 2022, 1:39 PM

I believe we should be able to do this as a canonicalization (at least dropping the constant bailout for the existing transform -- not sure about the case where not all inputs are GEPs) and undo in the backend -- this is the usual problem that in IR we want to form constant phis, while the backend prefers pushing operations into phis, because it often avoids constant materialization "for free". There have been multiple attempts at that, the latest being D119916. I still think the right approach to this problem is a late backend IR pass.

  • [InstCombiner] Removed flag to turn off the new type of canonicalization
  • [InstCombiner][Test] Removed flag from test

OK, I understand your concerns about the flag. I've also checked the test-suite results on X86 (they are attached below). There were no real regressions. The microbenchmark results at the top of the regression list aren't reproducible; other runs I made gave different results. Of course, I have no opportunity to check all backends.

The second important question that was raised concerns the first element of the array. This modification works only when the PHI contains GEPs and/or the array that those GEPs use (please have a look at the negative tests). This seems to be quite a common case: a GEP isn't always generated for the first element of an array. I don't see why we should ignore the first element. Of course, it's possible to generate a GEP instruction for the first element, but I'm not sure it's worth doing.

The results for X86

Tests: 2900
Short Running: 2325 (filtered out)
Remaining: 575
Metric: exec_time

Program                                                                                                                                          results1  results   diff  
                                           test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_PRESSURE_CALC_LAMBDA/5001      6.17      7.46  20.9%
                                                                     test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-matrix.test      0.86      0.88   2.8%
                   test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_runtime_checks_fail<16, int>      5.75      5.88   2.3%
                       test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_runtime_checks_pass<16, int>      3.88      3.96   2.0%
                                          test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_PRESSURE_CALC_LAMBDA/44217     57.17     58.26   1.9%
                                                                                 test-suite :: MultiSource/Applications/lambda-0.1.3/lambda.test      1.60      1.62   1.4%
                                                                                   test-suite :: MultiSource/Applications/hexxagon/hexxagon.test      3.66      3.71   1.3%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_HYDRO_1D_LAMBDA/44217     13.91     14.02   0.8%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, LessThanZero, Mid>    467.97    471.40   0.7%
                                                                                  test-suite :: SingleSource/UnitTests/Vector/Vector-build2.test      1.27      1.28   0.6%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_MULADDSUB_LAMBDA/44217     77.56     77.94   0.5%
                       test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_runtime_checks_fail<16, int>      3.89      3.89   0.2%
                   test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_runtime_checks_pass<16, int>      5.90      5.91   0.1%
                 test-suite :: MicroBenchmarks/ImageProcessing/AnisotropicDiffusion/AnisotropicDiffusion.test:BENCHMARK_ANISTROPIC_DIFFUSION/256  20371.53  20291.87  -0.4%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_expf_novec_float_     68.57     68.18  -0.6%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, GreaterThanZero, None>    592.11    587.85  -0.7%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, LessThanZero, Last>    702.31    696.73  -0.8%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_expf_autovec_float_     68.52     67.88  -0.9%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, EqZero, None>    588.38    582.80  -0.9%
                                                                                  test-suite :: SingleSource/UnitTests/Vectorizer/gcc-loops.test      0.96      0.95  -1.1%
                                                                               test-suite :: SingleSource/Benchmarks/Shootout/Shootout-fib2.test      0.90      0.89  -1.2%
                                                                       test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-fibo.test      0.96      0.94  -1.2%
                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDAfterA/32   6690.78   6597.45  -1.4%
                                                                                     test-suite :: MultiSource/Applications/viterbi/viterbi.test      0.63      0.62  -1.5%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, EqZero, Last>    588.72    579.63  -1.5%
                                                                                          test-suite :: SingleSource/Benchmarks/Misc/perlin.test      0.90      0.89  -1.6%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, EqZero, First>    586.99    577.70  -1.6%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, GreaterThanZero, First>    528.17    519.69  -1.6%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, GreaterThanZero, Mid>    807.39    793.24  -1.8%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, EqZero, Mid>    417.44    409.77  -1.8%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, GreaterThanZero, Last>    807.75    792.70  -1.9%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, LessThanZero, None>    572.04    561.36  -1.9%
                                                                           test-suite :: SingleSource/Benchmarks/Shootout/Shootout-methcall.test      1.85      1.82  -1.9%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, LessThanZero, Last>    571.49    560.19  -2.0%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, LessThanZero, Last>    593.47    581.55  -2.0%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, LessThanZero, First>    467.26    457.60  -2.1%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, GreaterThanZero, First>    810.57    793.80  -2.1%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt.test      0.79      0.77  -2.1%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, LessThanZero, Mid>    655.05    640.83  -2.2%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, LessThanZero, None>    948.93    927.79  -2.2%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, LessThanZero, None>    526.75    514.99  -2.2%
                                                        test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt.test      0.99      0.97  -2.2%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, LessThanZero, Last>    352.40    344.50  -2.2%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, LessThanZero, First>   1090.29   1065.75  -2.3%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PLANCKIAN_RAW/44217    210.42    205.68  -2.3%
                                                                                             test-suite :: MultiSource/Applications/aha/aha.test      0.75      0.73  -2.4%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, EqZero, Last>    973.73    950.52  -2.4%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_exp_novec_double_     87.47     85.38  -2.4%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, LessThanZero, Mid>    545.02    531.56  -2.5%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/32/4    169.26    165.07  -2.5%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, GreaterThanZero, None>    650.03    633.92  -2.5%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, EqZero, Mid>    974.41    949.97  -2.5%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, EqZero, None>    275.72    268.80  -2.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, GreaterThanZero, None>    130.61    127.26  -2.6%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, GreaterThanZero, First>    622.44    606.38  -2.6%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_HYDRO_1D_LAMBDA/5001      1.53      1.49  -2.6%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_MAT_X_MAT_RAW/171     67.25     65.49  -2.6%
                                                                                   test-suite :: Bitcode/Benchmarks/Halide/blur/halide_blur.test      1.44      1.41  -2.6%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PLANCKIAN_RAW/5001     23.59     22.96  -2.7%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, GreaterThanZero, Last>    655.52    637.89  -2.7%
                  test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_no_runtime_checks_needed<16, int>      1.73      1.69  -2.7%
                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDEqualsA/32  16691.33  16236.89  -2.7%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, LessThanZero, Mid>   1093.55   1063.75  -2.7%
                  test-suite :: MicroBenchmarks/ImageProcessing/AnisotropicDiffusion/AnisotropicDiffusion.test:BENCHMARK_ANISTROPIC_DIFFUSION/64   1056.63   1027.50  -2.8%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, EqZero, Last>    421.45    409.78  -2.8%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, EqZero, None>    975.60    948.42  -2.8%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, EqZero, First>    977.50    949.62  -2.9%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_acosf_autovec_float_    104.11    101.14  -2.9%
                                                            test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_2D_RAW/171      0.63      0.61  -2.9%
                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDAfterA/1000 146882.35 142632.06  -2.9%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, LessThanZero, First>    660.29    640.99  -2.9%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, GreaterThanZero, Last>   1626.70   1578.43  -3.0%
                                                        test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-dbl/LinearDependence-dbl.test      1.30      1.26  -3.0%
                                                                               test-suite :: SingleSource/Benchmarks/BenchmarkGame/fannkuch.test      1.64      1.59  -3.0%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_pass<2, double>      2.04      1.97  -3.1%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, LessThanZero, Mid>    706.56    684.67  -3.1%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/16/4     31.07     30.10  -3.1%
                 test-suite :: MicroBenchmarks/ImageProcessing/AnisotropicDiffusion/AnisotropicDiffusion.test:BENCHMARK_ANISTROPIC_DIFFUSION/128   4828.75   4676.94  -3.1%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_IF_QUAD_LAMBDA/44217    125.83    121.86  -3.2%
     test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersAllDisjointDecreasing/32   5473.88   5300.91  -3.2%
                                                                               test-suite :: MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk.test      1.29      1.25  -3.2%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/32/2     48.85     47.30  -3.2%
                    test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_runtime_checks_pass<4, int>      1.89      1.83  -3.2%
                                                                   test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-methcall.test      2.65      2.56  -3.2%
                                                              test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl.test      1.83      1.77  -3.3%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_ADI_LAMBDA/171      2.02      1.96  -3.3%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/16/2     10.55     10.20  -3.3%
                                                                          test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test      1.42      1.38  -3.4%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/64/4    782.73    756.31  -3.4%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, EqZero, None>    423.18    408.68  -3.4%
                                                    test-suite :: MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl.test      1.47      1.42  -3.4%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, GreaterThanZero, First>    752.46    726.54  -3.4%
                           test-suite :: MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter.test:BENCHMARK_BILATERAL_FILTER/64/2    208.01    200.82  -3.5%
                test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDEqualsA/1000 437381.77 422205.70  -3.5%
                                                                                       test-suite :: MultiSource/Benchmarks/Olden/em3d/em3d.test      0.98      0.94  -3.5%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, LessThanZero, First>    549.77    530.58  -3.5%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_ENERGY_CALC_LAMBDA/5001     24.96     24.09  -3.5%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PLANCKIAN_LAMBDA/5001     23.86     23.02  -3.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, GreaterThanZero, First>   1252.87   1208.58  -3.5%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, EqZero, None>    127.30    122.73  -3.6%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_VOL3D_CALC_LAMBDA/2      1.93      1.86  -3.6%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PLANCKIAN_LAMBDA/44217    212.25    204.57  -3.6%
   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersAllDisjointDecreasing/1000 140639.22 135534.19  -3.6%
   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersAllDisjointIncreasing/1000 180969.23 174395.46  -3.6%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, GreaterThanZero, Last>    866.07    834.53  -3.6%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, GreaterThanZero, Last>    668.36    643.91  -3.7%
                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDBeforeA/32  19009.93  18313.17  -3.7%
                  test-suite :: MicroBenchmarks/ImageProcessing/AnisotropicDiffusion/AnisotropicDiffusion.test:BENCHMARK_ANISTROPIC_DIFFUSION/32    215.27    207.33  -3.7%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, GreaterThanZero, Mid>    284.14    273.49  -3.7%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, GreaterThanZero, None>   1099.25   1057.99  -3.8%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_VOL3D_CALC_LAMBDA/1     51.69     49.74  -3.8%
              test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_no_runtime_checks_needed<16, int>      1.97      1.90  -3.8%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, LessThanZero, Mid>    414.06    398.27  -3.8%
                                                              test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test      1.98      1.91  -3.8%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_exp_autovec_double_     88.11     84.70  -3.9%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, LessThanZero, None>   1408.04   1353.64  -3.9%
                        test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_runtime_checks_fail<4, int>      1.75      1.68  -3.9%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, EqZero, Mid>    208.10    199.89  -3.9%
     test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersAllDisjointIncreasing/32   5495.25   5276.79  -4.0%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, EqZero, Mid>    601.37    577.09  -4.0%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, GreaterThanZero, Mid>    630.07    604.58  -4.0%
                   test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_no_runtime_checks_needed<4, int>      1.80      1.72  -4.0%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_COUPLE_RAW/1    125.13    120.02  -4.1%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, LessThanZero, Last>   1419.79   1361.73  -4.1%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIRST_DIFF_RAW/5001      1.24      1.19  -4.1%
                                                                                         test-suite :: MultiSource/Benchmarks/Bullet/bullet.test      1.95      1.87  -4.2%
                                                                 test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/Pathfinder/PathFinder.test      1.30      1.24  -4.2%
                    test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_runtime_checks_fail<4, int>      1.91      1.83  -4.2%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Expansion-dbl/Expansion-dbl.test      1.30      1.25  -4.2%
                                                  test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIND_FIRST_MIN_RAW/44217     16.99     16.27  -4.2%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, GreaterThanZero, Mid>   1262.37   1208.63  -4.3%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_pass<3, double>      2.14      2.05  -4.3%
                                                  test-suite :: MultiSource/Benchmarks/TSVC/StatementReordering-dbl/StatementReordering-dbl.test      2.01      1.93  -4.3%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, GreaterThanZero, None>    781.89    747.85  -4.4%
                                                           test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test      3.87      3.70  -4.4%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test      1.74      1.67  -4.4%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, EqZero, Last>   1106.36   1058.08  -4.4%
                                                                                   test-suite :: MultiSource/Benchmarks/SciMark2-C/scimark2.test     20.27     19.39  -4.4%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_VOL3D_CALC_RAW/1     52.19     49.88  -4.4%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_VOL3D_CALC_RAW/0    238.52    227.97  -4.4%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, LessThanZero, Last>    971.53    928.49  -4.4%
                                                                                      test-suite :: SingleSource/Benchmarks/SmallPT/smallpt.test      3.24      3.09  -4.4%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_DISC_ORD_LAMBDA/171      3.08      2.94  -4.5%
                                                                         test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test      0.68      0.65  -4.5%
                                                                              test-suite :: SingleSource/Benchmarks/BenchmarkGame/recursive.test      0.78      0.75  -4.5%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, EqZero, None>   1105.38   1055.15  -4.5%
                                                                                     test-suite :: MultiSource/Applications/SIBsim4/SIBsim4.test      1.40      1.34  -4.6%
                                                                             test-suite :: SingleSource/Benchmarks/Shootout/Shootout-random.test      1.44      1.37  -4.6%
                                                                  test-suite :: MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl.test      1.63      1.55  -4.6%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIRST_DIFF_LAMBDA/5001      1.13      1.08  -4.6%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, EqZero, None>    561.75    536.00  -4.6%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PLANCKIAN_LAMBDA/171      0.82      0.78  -4.6%
                                                                                  test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test      9.88      9.43  -4.6%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cosf_autovec_float_    127.80    121.91  -4.6%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_ENERGY_CALC_LAMBDA/171      0.86      0.82  -4.6%
                                                                                      test-suite :: SingleSource/Benchmarks/Misc/fp-convert.test      1.03      0.98  -4.6%
                                                                                         test-suite :: SingleSource/Benchmarks/Misc/flops-8.test      0.67      0.64  -4.7%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt.test      0.67      0.64  -4.7%
                                                                                     test-suite :: SingleSource/Benchmarks/Misc/ReedSolomon.test      2.29      2.18  -4.7%
                                         test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIND_FIRST_MIN_LAMBDA/44217     17.08     16.27  -4.7%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, EqZero, None>    508.69    484.56  -4.7%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, EqZero, Last>    501.49    477.55  -4.8%
                                                    test-suite :: MultiSource/Benchmarks/TSVC/IndirectAddressing-dbl/IndirectAddressing-dbl.test      1.89      1.80  -4.8%
                                                           test-suite :: MicroBenchmarks/ImageProcessing/Dilate/Dilate.test:BENCHMARK_DILATE/128     29.33     27.93  -4.8%
                                                                       test-suite :: SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction.test      2.91      2.77  -4.8%
                                                                                               test-suite :: MultiSource/Benchmarks/sim/sim.test      1.77      1.68  -4.8%
                                                                               test-suite :: SingleSource/Benchmarks/Misc-C++/stepanov_v1p2.test      5.67      5.39  -4.9%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<3, GreaterThanZero, Last>   1115.48   1061.15  -4.9%
                                                      test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt.test      1.73      1.64  -4.9%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_INIT3_LAMBDA/5001      7.33      6.97  -4.9%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_TRAP_INT_LAMBDA/44217    104.49     99.35  -4.9%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, LessThanZero, Mid>   1430.28   1359.75  -4.9%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_VOL3D_CALC_LAMBDA/0    238.96    227.17  -4.9%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_ENERGY_CALC_LAMBDA/44217    252.76    240.28  -4.9%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, LessThanZero, None>    296.84    282.16  -4.9%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_IF_QUAD_LAMBDA/5001     14.15     13.45  -5.0%
                                                                            test-suite :: SingleSource/Benchmarks/Adobe-C++/functionobjects.test      1.97      1.87  -5.0%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Searching-flt/Searching-flt.test      1.73      1.64  -5.0%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cbrt_autovec_double_    258.44    245.50  -5.0%
                                                  test-suite :: MultiSource/Benchmarks/TSVC/StatementReordering-flt/StatementReordering-flt.test      1.45      1.38  -5.0%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, EqZero, None>    185.12    175.83  -5.0%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_ENERGY_CALC_RAW/5001     20.50     19.47  -5.0%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/4     40.24     38.21  -5.0%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, EqZero, Mid>    564.19    535.51  -5.1%
                                                     test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_FLOYD_DITHER/128     95.77     90.89  -5.1%
                                                                             test-suite :: SingleSource/Benchmarks/Shootout/Shootout-matrix.test      0.85      0.80  -5.1%
                test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:benchVecWithRuntimeChecks4PointersDBeforeA/1000 811740.28 770247.01  -5.1%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/2     35.98     34.14  -5.1%
                                          test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIND_FIRST_MIN_LAMBDA/5001      1.91      1.81  -5.1%
                                                            test-suite :: MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl.test      1.20      1.14  -5.1%
                                                      test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt.test      1.64      1.55  -5.1%
                                                                    test-suite :: MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl.test      2.45      2.32  -5.1%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIRST_SUM_LAMBDA/5001      4.49      4.26  -5.2%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_GEN_LIN_RECUR_RAW/5001     27.04     25.64  -5.2%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIRST_SUM_LAMBDA/44217     39.83     37.77  -5.2%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, GreaterThanZero, Last>   2212.54   2098.17  -5.2%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cosf_novec_float_    128.49    121.85  -5.2%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_TRAP_INT_RAW/44217    103.65     98.28  -5.2%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_DEL_DOT_VEC_2D_RAW/1     25.43     24.12  -5.2%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_fail<3, double>      3.08      2.92  -5.2%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PLANCKIAN_RAW/171      0.82      0.78  -5.2%
                                                                                   test-suite :: SingleSource/Benchmarks/Misc-C++/Large/ray.test      1.67      1.58  -5.2%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_MULADDSUB_LAMBDA/5001      8.15      7.72  -5.2%
                                                                     test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-random.test      1.45      1.37  -5.2%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, EqZero, First>    431.99    409.43  -5.2%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sin_autovec_double_    261.28    247.64  -5.2%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, GreaterThanZero, Mid>   2207.70   2092.37  -5.2%
                                                           test-suite :: MicroBenchmarks/ImageProcessing/Dilate/Dilate.test:BENCHMARK_DILATE/256    114.74    108.74  -5.2%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, EqZero, Mid>   1112.95   1054.74  -5.2%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_MAT_X_MAT_LAMBDA/171     67.86     64.31  -5.2%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_MULADDSUB_RAW/5001      8.37      7.93  -5.2%
                                                              test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt.test      1.67      1.58  -5.2%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, EqZero, Mid>    285.55    270.55  -5.3%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_EOS_LAMBDA/44217     42.20     39.96  -5.3%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_fail<4, double>      3.13      2.96  -5.3%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIRST_DIFF_RAW/44217     11.05     10.46  -5.3%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_EOS_LAMBDA/5001      4.75      4.49  -5.3%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_BAND_LIN_EQ_LAMBDA/44217     23.88     22.60  -5.3%
                                                                                 test-suite :: SingleSource/Benchmarks/Misc-C++/mandel-text.test      1.00      0.94  -5.3%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cbrtf_novec_float_    241.85    228.85  -5.4%
                                                    test-suite :: MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt.test      1.62      1.53  -5.4%
                                           test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_GEN_LIN_RECUR_LAMBDA/5001     27.14     25.67  -5.4%
                                                      test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.test      2.07      1.95  -5.4%
                                                           test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_2D_RAW/5001     19.98     18.90  -5.4%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/3     45.02     42.59  -5.4%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/8     40.48     38.29  -5.4%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<7, GreaterThanZero, Mid>    549.08    519.22  -5.4%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sin_novec_double_    261.05    246.85  -5.4%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_FIRST_DIFF_LAMBDA/44217     10.09      9.54  -5.5%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, LessThanZero, None>    637.50    602.74  -5.5%
                                                                test-suite :: MultiSource/Benchmarks/TSVC/ControlLoops-flt/ControlLoops-flt.test      1.66      1.57  -5.5%
test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_no_runtime_checks_needed<3, double>      3.16      2.99  -5.5%
                                                                                             test-suite :: MultiSource/Applications/lua/lua.test      8.09      7.65  -5.5%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, EqZero, First>    505.49    477.77  -5.5%
                                                                  test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-flt/Recurrences-flt.test      2.61      2.47  -5.5%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_TRAP_INT_LAMBDA/5001     11.89     11.24  -5.5%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_FIR_RAW/5001      8.31      7.85  -5.5%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_IMP_HYDRO_2D_RAW/171      4.34      4.10  -5.5%
                                                                                         test-suite :: MultiSource/Applications/spiff/spiff.test      0.71      0.67  -5.5%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_COUPLE_LAMBDA/2      1.01      0.96  -5.5%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_TRAP_INT_RAW/5001     11.74     11.10  -5.5%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinhf_novec_float_    232.84    219.95  -5.5%
                                                            test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_INIT3_RAW/5001      7.38      6.97  -5.5%
                                                     test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_FLOYD_DITHER/256    391.74    370.02  -5.5%
                                                                  test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl.test      2.61      2.47  -5.6%
                                                      test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl.test      1.83      1.73  -5.6%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_DEL_DOT_VEC_2D_RAW/0    155.00    146.35  -5.6%
                                                                    test-suite :: MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt.test      4.01      3.79  -5.6%
                                                                      test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-dbl/Symbolics-dbl.test      1.07      1.01  -5.7%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, EqZero, Mid>    141.58    133.50  -5.7%
                                                                                         test-suite :: SingleSource/Benchmarks/Misc/salsa20.test      3.08      2.90  -5.7%
                                                                      test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-hash2.test      1.18      1.11  -5.7%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIND_FIRST_MIN_RAW/5001      1.92      1.81  -5.7%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_TRIDIAG_ELIM_LAMBDA/5001      9.03      8.51  -5.7%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_BAND_LIN_EQ_LAMBDA/5001      2.62      2.47  -5.7%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_INNER_PROD_LAMBDA/44217     90.32     85.10  -5.8%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/4    161.67    152.30  -5.8%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, GreaterThanZero, None>    323.91    305.01  -5.8%
                                                                  test-suite :: MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt.test      1.48      1.39  -5.9%
                                                                                   test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test      0.85      0.80  -5.9%
test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_no_runtime_checks_needed<4, double>      3.19      3.00  -5.9%
                                                           test-suite :: MicroBenchmarks/ImageProcessing/Dilate/Dilate.test:BENCHMARK_DILATE/512    454.63    427.70  -5.9%
                                                                                test-suite :: SingleSource/Benchmarks/CoyoteBench/huffbench.test      6.12      5.76  -5.9%
                                          test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_GEN_LIN_RECUR_LAMBDA/44217    241.10    226.81  -5.9%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, GreaterThanZero, None>    842.75    792.73  -5.9%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_IMP_HYDRO_2D_LAMBDA/171      4.36      4.10  -5.9%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_COUPLE_RAW/2      1.02      0.96  -5.9%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinhf_autovec_float_    236.85    222.74  -6.0%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_COUPLE_RAW/0    740.07    695.98  -6.0%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_boxBlurKernel/256    146.79    138.04  -6.0%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinf_autovec_float_    132.29    124.39  -6.0%
                                                                      test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-sieve.test      0.93      0.88  -6.0%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_ENERGY_CALC_RAW/171      0.69      0.65  -6.0%
                                                                            test-suite :: SingleSource/Benchmarks/Adobe-C++/stepanov_vector.test      1.49      1.40  -6.0%
                                                                                         test-suite :: MultiSource/Benchmarks/nbench/nbench.test      1.04      0.98  -6.0%
                                                                              test-suite :: MultiSource/Benchmarks/Trimaran/enc-md5/enc-md5.test      1.06      1.00  -6.0%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cos_autovec_double_    278.22    261.54  -6.0%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_GEN_LIN_RECUR_RAW/171      0.93      0.88  -6.0%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cbrt_novec_double_    261.30    245.57  -6.0%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_INNER_PROD_LAMBDA/5001     10.17      9.56  -6.0%
                                                                                           test-suite :: MultiSource/Applications/siod/siod.test      0.83      0.78  -6.0%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_erf_novec_double_    114.88    107.94  -6.0%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_pass<4, double>      2.27      2.13  -6.0%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/8    161.89    152.04  -6.1%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIRST_SUM_RAW/5001      4.53      4.25  -6.1%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, GreaterThanZero, First>   1681.27   1578.81  -6.1%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cos_novec_double_    279.35    262.28  -6.1%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_IF_QUAD_RAW/44217    136.01    127.70  -6.1%
test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_no_runtime_checks_needed<2, double>      3.19      2.99  -6.1%
                                                    test-suite :: MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt.test      1.16      1.09  -6.1%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_GAUSSIAN_BLUR/512  11678.81  10961.88  -6.1%
                                                                   test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/SimpleMOC/SimpleMOC.test      0.81      0.76  -6.1%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_GEN_LIN_RECUR_RAW/44217    241.98    227.09  -6.2%
                                                             test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_FIR_RAW/44217     73.99     69.44  -6.2%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/3    714.15    670.12  -6.2%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/2    570.02    534.85  -6.2%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_IF_QUAD_RAW/5001     14.65     13.75  -6.2%
                                                                                           test-suite :: MultiSource/Benchmarks/Olden/bh/bh.test      0.78      0.73  -6.2%
     test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_multiply_accumulate_runtime_checks_fail<2, double>      3.08      2.88  -6.2%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_DISC_ORD_RAW/171      3.13      2.94  -6.2%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_boxBlurKernel/128     35.49     33.28  -6.2%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_FIRST_SUM_RAW/44217     40.41     37.89  -6.2%
                                                                  test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-ackermann.test      2.06      1.93  -6.2%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_atan_autovec_double_    155.69    145.97  -6.2%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_cbrtf_autovec_float_    243.56    228.34  -6.3%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_IMP_HYDRO_2D_RAW/5001    137.01    128.45  -6.3%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, GreaterThanZero, First>   2221.50   2082.47  -6.3%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/8    640.55    600.38  -6.3%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_DISC_ORD_LAMBDA/5001     94.65     88.70  -6.3%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, GreaterThanZero, Mid>   1682.95   1577.09  -6.3%
                                                                            test-suite :: MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des.test      1.10      1.03  -6.3%
                                                       test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_GAUSSIAN_BLUR/1024  47565.56  44555.24  -6.3%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_GEN_LIN_RECUR_LAMBDA/171      0.94      0.88  -6.4%
                                                             test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HACCKernels/HACCKernels.test      1.05      0.98  -6.4%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_VOL3D_CALC_RAW/2      2.00      1.87  -6.4%
                                                                              test-suite :: SingleSource/Benchmarks/Shootout/Shootout-sieve.test      1.78      1.66  -6.4%
                                                                    test-suite :: MultiSource/Benchmarks/Trimaran/netbench-url/netbench-url.test      2.23      2.08  -6.4%
                                                                                        test-suite :: SingleSource/Benchmarks/Misc/oourafft.test      1.35      1.26  -6.4%
                                                                                   test-suite :: SingleSource/Benchmarks/Misc-C++-EH/spirit.test      3.59      3.36  -6.4%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<5, GreaterThanZero, Mid>    786.12    735.26  -6.5%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_COUPLE_LAMBDA/1    127.83    119.53  -6.5%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_atan_novec_double_    156.24    146.04  -6.5%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinh_novec_double_    233.32    218.07  -6.5%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/4    642.74    600.61  -6.6%
                                                     test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_FLOYD_DITHER/512   1607.78   1501.97  -6.6%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, LessThanZero, First>    262.65    245.37  -6.6%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_GAUSSIAN_BLUR/256   2844.35   2657.04  -6.6%
                                                  test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PIC_1D_LAMBDA/5001     32.84     30.67  -6.6%
                                                                           test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test      2.07      1.93  -6.6%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_COUPLE_LAMBDA/0    745.20    695.91  -6.6%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, LessThanZero, First>   1453.45   1357.09  -6.6%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_GAUSSIAN_BLUR/128    666.23    621.84  -6.7%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, GreaterThanZero, None>   2227.11   2078.54  -6.7%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, LessThanZero, Last>    351.29    327.78  -6.7%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_erff_autovec_float_    115.49    107.76  -6.7%
                                           test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_TRIDIAG_ELIM_LAMBDA/44217     81.04     75.60  -6.7%
                                                                test-suite :: MultiSource/Benchmarks/TSVC/ControlLoops-dbl/ControlLoops-dbl.test      1.79      1.67  -6.7%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, EqZero, First>   1131.45   1055.18  -6.7%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, EqZero, First>   2262.30   2109.60  -6.7%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_atanf_novec_float_    136.69    127.43  -6.8%
                                  test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_erf_autovec_double_    121.76    113.44  -6.8%
                        test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_xor_runtime_checks_pass<4, int>      1.67      1.56  -6.8%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<2, GreaterThanZero, None>   1688.82   1573.30  -6.8%
                                                  test-suite :: MicroBenchmarks/LCALS/SubsetBLambdaLoops/lcalsBLambda.test:BM_INIT3_LAMBDA/44217     71.41     66.52  -6.8%
                                                              test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt.test      1.46      1.36  -6.9%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_HYDRO_2D_RAW/171      7.89      7.34  -6.9%
                                                                             test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/CoMD/CoMD.test      0.80      0.74  -6.9%
                               test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BICUBIC_INTERPOLATION/32    171.01    159.16  -6.9%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinf_novec_float_    130.93    121.80  -7.0%
                              test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BICUBIC_INTERPOLATION/128   3185.36   2962.97  -7.0%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, EqZero, First>    189.59    176.34  -7.0%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/3    182.15    169.40  -7.0%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_IMP_HYDRO_2D_LAMBDA/5001    137.68    128.03  -7.0%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, LessThanZero, Mid>    263.11    244.60  -7.0%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_DISC_ORD_RAW/44217    845.24    785.77  -7.0%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_DISC_ORD_RAW/5001     95.55     88.82  -7.0%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_DEL_DOT_VEC_2D_LAMBDA/1     25.79     23.97  -7.0%
                                                                test-suite :: SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.test      0.83      0.77  -7.0%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, EqZero, Last>    576.58    535.82  -7.1%
                                                         test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/syr2k/syr2k.test      4.07      3.78  -7.1%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, LessThanZero, None>    128.00    118.90  -7.1%
                                                                                         test-suite :: SingleSource/Benchmarks/Misc/flops-3.test      0.73      0.67  -7.1%
                                                                                           test-suite :: SingleSource/Benchmarks/Misc/flops.test      3.02      2.80  -7.1%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_HYDRO_2D_LAMBDA/171      7.90      7.33  -7.2%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_PRESSURE_CALC_RAW/44217     60.66     56.28  -7.2%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, LessThanZero, None>    738.63    685.07  -7.3%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_PRESSURE_CALC_RAW/5001      6.48      6.00  -7.3%
                                    test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_DivideIntrinsic128UniformDivisor<__uint128_t>     10.96     10.16  -7.3%
                                    test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_erff_novec_float_    113.93    105.57  -7.3%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_sinh_autovec_double_    242.72    224.90  -7.3%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, GreaterThanZero, Last>    391.07    362.28  -7.4%
                              test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BICUBIC_INTERPOLATION/256  13132.98  12159.48  -7.4%
                                                                              test-suite :: MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes.test      1.28      1.19  -7.4%
                                                                       test-suite :: MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk.test      1.54      1.42  -7.5%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_FIR_LAMBDA/5001      8.44      7.81  -7.5%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, LessThanZero, First>    108.98    100.79  -7.5%
                                                                                         test-suite :: MultiSource/Applications/SPASS/SPASS.test      3.61      3.33  -7.7%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_asinf_novec_float_    110.65    102.13  -7.7%
                                                                                     test-suite :: MultiSource/Applications/obsequi/Obsequi.test      0.95      0.87  -7.7%
                                                                              test-suite :: SingleSource/Benchmarks/Shootout/Shootout-lists.test      2.63      2.43  -7.7%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, LessThanZero, Last>    813.42    750.36  -7.8%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_asin_autovec_double_     91.87     84.65  -7.9%
                               test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BICUBIC_INTERPOLATION/64    767.38    706.91  -7.9%
                              test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BILINEAR_INTERPOLATION/32     43.04     39.64  -7.9%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, GreaterThanZero, First>    277.96    255.95  -7.9%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_DEL_DOT_VEC_2D_LAMBDA/0    157.81    145.31  -7.9%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, LessThanZero, Last>   2263.00   2083.60  -7.9%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_atanf_autovec_float_    140.44    129.25  -8.0%
                              test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BILINEAR_INTERPOLATION/16     10.62      9.77  -8.0%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_1D_RAW/44217    309.01    284.31  -8.0%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, GreaterThanZero, Mid>    154.94    142.46  -8.1%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_INNER_PROD_RAW/44217     92.61     85.10  -8.1%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_asin_novec_double_     92.76     85.18  -8.2%
                                                                                        test-suite :: SingleSource/Benchmarks/McGill/queens.test      1.26      1.16  -8.2%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, EqZero, Last>    156.77    143.84  -8.3%
                                                            test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_1D_RAW/171      0.94      0.87  -8.3%
                                           test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_IMP_HYDRO_2D_LAMBDA/44217   1243.26   1140.42  -8.3%
                                                           test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_INIT3_RAW/44217     72.18     66.20  -8.3%
                                                                       test-suite :: MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow.test      0.92      0.84  -8.3%
                                                                                    test-suite :: MultiSource/Applications/JM/lencod/lencod.test      2.27      2.08  -8.3%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, LessThanZero, First>   2258.76   2070.46  -8.3%
                                     test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_DivideIntrinsic128UniformDivisor<__int128_t>     12.82     11.75  -8.4%
                                                                       test-suite :: MicroBenchmarks/harris/harris.test:BENCHMARK_HARRIS/256/256    272.57    249.45  -8.5%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, GreaterThanZero, Mid>    123.43    112.96  -8.5%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_acos_novec_double_     93.66     85.68  -8.5%
                                                                          test-suite :: MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl.test      1.56      1.43  -8.5%
                                                test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_HYDRO_2D_LAMBDA/5001    247.82    226.68  -8.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, GreaterThanZero, Last>    154.07    140.83  -8.6%
                                                           test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/syrk/syrk.test      2.01      1.84  -8.6%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_ADI_LAMBDA/5001     65.25     59.62  -8.6%
                                                                                     test-suite :: MultiSource/Applications/minisat/minisat.test      2.83      2.59  -8.6%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, EqZero, Mid>    123.46    112.80  -8.6%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, EqZero, First>    121.84    111.29  -8.7%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_FIR_LAMBDA/44217     75.70     69.13  -8.7%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_IMP_HYDRO_2D_RAW/44217   1252.71   1143.63  -8.7%
                                   test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_acosf_novec_float_    112.90    103.05  -8.7%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, LessThanZero, Mid>    109.72    100.08  -8.8%
                                       test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_DivideIntrinsic128SmallDivisor<__int128_t>      7.21      6.58  -8.8%
                                                                                test-suite :: SingleSource/Benchmarks/CoyoteBench/almabench.test      2.97      2.70  -8.8%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, EqZero, Mid>    115.98    105.71  -8.9%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, LessThanZero, First>    752.12    684.80  -9.0%
                                    test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_RemainderIntrinsic128SmallDivisor<__int128_t>      7.20      6.55  -9.0%
                                                 test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/2    148.69    135.31  -9.0%
                                                                                     test-suite :: MultiSource/Applications/sqlite3/sqlite3.test      1.25      1.14  -9.0%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_asinf_autovec_float_    113.13    102.89  -9.0%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PIC_1D_LAMBDA/171      0.95      0.86  -9.1%
                                                                                      test-suite :: MultiSource/Benchmarks/llubenchmark/llu.test      2.86      2.60  -9.1%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_INT_PREDICT_RAW/44217    248.96    226.28  -9.1%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<4, EqZero, First>    589.29    535.51  -9.1%
                                                                                         test-suite :: SingleSource/Benchmarks/Misc/flops-5.test      0.68      0.61  -9.2%
                                                        test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_boxBlurKernel/512    629.89    572.01  -9.2%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, GreaterThanZero, None>    145.66    132.26  -9.2%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_DISC_ORD_LAMBDA/44217    864.64    784.42  -9.3%
                                 test-suite :: MicroBenchmarks/LoopVectorization/LoopVectorizationBenchmarks.test:BENCHMARK_acos_autovec_double_     94.47     85.69  -9.3%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, EqZero, None>    116.83    105.97  -9.3%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, LessThanZero, None>   2280.75   2068.17  -9.3%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, GreaterThanZero, First>    301.93    273.39  -9.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, GreaterThanZero, Last>    158.76    143.71  -9.5%
                                      test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_DivideIntrinsic128SmallDivisor<__uint128_t>      6.11      5.53  -9.5%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, LessThanZero, First>    444.01    401.72  -9.5%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, EqZero, First>    297.85    269.45  -9.5%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, LessThanZero, Last>    160.57    144.96  -9.7%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, EqZero, Last>    156.18    140.96  -9.7%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PIC_1D_LAMBDA/44217    310.12    279.73  -9.8%
                                  test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_RemainderIntrinsic128UniformDivisor<__int128_t>      8.07      7.28  -9.8%
                                 test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_RemainderIntrinsic128UniformDivisor<__uint128_t>      8.18      7.38  -9.8%
                                                       test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test:BENCHMARK_boxBlurKernel/1024   2577.41   2322.58  -9.9%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, LessThanZero, None>    248.51    223.91  -9.9%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, EqZero, Last>    122.57    110.37 -10.0%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, EqZero, None>   2276.43   2049.32 -10.0%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, EqZero, None>    136.81    123.11 -10.0%
                                                                       test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test      1.94      1.74 -10.0%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, GreaterThanZero, First>    127.45    114.63 -10.1%
                                                          test-suite :: MicroBenchmarks/ImageProcessing/Dilate/Dilate.test:BENCHMARK_DILATE/1024   1906.53   1714.65 -10.1%
                                   test-suite :: MicroBenchmarks/Builtins/Int128/Builtins.test:BM_RemainderIntrinsic128SmallDivisor<__uint128_t>      6.10      5.49 -10.1%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, EqZero, Last>    222.36    199.91 -10.1%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, GreaterThanZero, First>    122.73    110.34 -10.1%
                                                                       test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/XSBench/XSBench.test      1.51      1.36 -10.1%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<63, EqZero, First>    126.65    113.79 -10.2%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_BAND_LIN_EQ_RAW/44217     25.18     22.62 -10.2%
                                                  test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PIC_2D_LAMBDA/5001     20.89     18.77 -10.2%
                                                    test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, LessThanZero, Mid>   2313.86   2077.80 -10.2%
                              test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BILINEAR_INTERPOLATION/64    177.49    159.37 -10.2%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, EqZero, Mid>    123.04    110.48 -10.2%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, LessThanZero, Mid>    143.53    128.85 -10.2%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_TRIDIAG_ELIM_RAW/5001      9.47      8.50 -10.3%
                                                                                      test-suite :: MultiSource/Benchmarks/NPB-serial/is/is.test      3.78      3.39 -10.3%
                                                                               test-suite :: MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk.test      3.80      3.41 -10.3%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, LessThanZero, None>    321.49    288.33 -10.3%
                                                                                   test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test      5.47      4.90 -10.3%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_INNER_PROD_RAW/5001     10.72      9.60 -10.4%
                                                                      test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists.test      1.64      1.46 -10.5%
                               test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BICUBIC_INTERPOLATION/16     35.87     32.09 -10.5%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, LessThanZero, Mid>    210.07    187.68 -10.7%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, EqZero, Last>    118.93    106.24 -10.7%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, LessThanZero, First>    123.51    110.25 -10.7%
                                                                                    test-suite :: MultiSource/Benchmarks/VersaBench/bmm/bmm.test      0.84      0.75 -10.8%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, GreaterThanZero, Mid>    241.14    215.13 -10.8%
                                                           test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_1D_RAW/5001     34.35     30.63 -10.8%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, LessThanZero, None>    130.40    116.27 -10.8%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<6, EqZero, Mid>    535.60    477.05 -10.9%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_ICCG_LAMBDA/5001      4.13      3.67 -10.9%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, GreaterThanZero, Mid>    508.31    452.51 -11.0%
                                                                           test-suite :: SingleSource/Benchmarks/Shootout/Shootout-heapsort.test      1.74      1.55 -11.0%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, EqZero, Last>    303.43    269.96 -11.0%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<64, LessThanZero, Last>    159.09    141.34 -11.2%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, EqZero, Last>   2323.33   2063.62 -11.2%
                                                                        test-suite :: MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1.test      0.75      0.66 -11.3%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_DIFF_PREDICT_RAW/5001     27.57     24.45 -11.3%
                                                                   test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-heapsort.test      1.76      1.56 -11.3%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_BAND_LIN_EQ_RAW/5001      2.78      2.46 -11.3%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, GreaterThanZero, Mid>    340.61    301.92 -11.4%
                                                                                           test-suite :: MultiSource/Benchmarks/PAQ8p/paq8p.test     20.01     17.71 -11.5%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_HYDRO_2D_RAW/5001    261.28    231.05 -11.6%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, EqZero, First>    120.77    106.71 -11.6%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, GreaterThanZero, None>    323.32    285.56 -11.7%
                                                                     test-suite :: MicroBenchmarks/harris/harris.test:BENCHMARK_HARRIS/2048/2048  36277.46  32024.86 -11.7%
                                                                           test-suite :: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.test      1.07      0.94 -11.8%
                                                               test-suite :: Bitcode/Benchmarks/Halide/bilateral_grid/halide_bilateral_grid.test     18.81     16.59 -11.8%
               test-suite :: MicroBenchmarks/SLPVectorization/SLPVectorizationBenchmarks.test:benchmark_add_xor_no_runtime_checks_needed<4, int>      1.68      1.47 -12.1%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetBRawLoops/lcalsBRaw.test:BM_MULADDSUB_RAW/44217     85.13     74.78 -12.2%
                                                     test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test:BM_ENERGY_CALC_RAW/44217    224.61    197.25 -12.2%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, EqZero, First>    126.71    111.27 -12.2%
                                                             test-suite :: Bitcode/Benchmarks/Halide/local_laplacian/halide_local_laplacian.test     15.20     13.34 -12.2%
                                                 test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_PIC_2D_LAMBDA/44217    273.61    240.09 -12.3%
                                                                     test-suite :: MicroBenchmarks/harris/harris.test:BENCHMARK_HARRIS/1024/1024  11221.61   9841.26 -12.3%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, LessThanZero, Last>    370.31    324.74 -12.3%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, GreaterThanZero, Mid>    387.15    339.27 -12.4%
                                                                test-suite :: MicroBenchmarks/LoopInterchange/LoopInterchange.test:BENCHMARK_LI1    103.98     91.01 -12.5%
                                                          test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<1, EqZero, Mid>   2351.16   2057.14 -12.5%
                                                         test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_HYDRO_1D_RAW/5001      1.71      1.49 -12.7%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_ADI_RAW/5001     70.24     61.09 -13.0%
                                                             test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_ICCG_RAW/5001      4.24      3.69 -13.0%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, GreaterThanZero, Last>    520.59    452.68 -13.0%
                                             test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_INT_PREDICT_LAMBDA/5001     14.19     12.25 -13.6%
                                                                                test-suite :: MultiSource/Benchmarks/VersaBench/8b10b/8b10b.test      2.60      2.24 -13.9%
                                                test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, GreaterThanZero, None>    526.07    452.15 -14.1%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_HYDRO_1D_RAW/44217     16.04     13.75 -14.2%
                                                                          test-suite :: SingleSource/Benchmarks/Misc-C++/stepanov_container.test      2.01      1.72 -14.3%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, LessThanZero, Last>    467.09    399.49 -14.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, GreaterThanZero, None>    379.01    323.97 -14.5%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, GreaterThanZero, Last>    396.78    339.04 -14.6%
                                                                       test-suite :: MicroBenchmarks/harris/harris.test:BENCHMARK_HARRIS/512/512   2934.50   2503.31 -14.7%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, LessThanZero, None>    311.48    265.36 -14.8%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, LessThanZero, None>    468.74    399.25 -14.8%
                                              test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_MAT_X_MAT_LAMBDA/44217 311084.78 264919.42 -14.8%
                                                                               test-suite :: SingleSource/Benchmarks/Shootout/Shootout-hash.test      1.91      1.63 -14.9%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_TRIDIAG_ELIM_RAW/44217     88.73     75.50 -14.9%
                                                             test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_ADI_RAW/44217    664.14    564.83 -15.0%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_MAT_X_MAT_RAW/5001   5187.48   4407.79 -15.0%
                                                          test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_PIC_2D_RAW/44217    269.38    228.85 -15.0%
                                                      test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_INT_PREDICT_RAW/5001     14.63     12.41 -15.2%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, GreaterThanZero, Last>    254.25    215.35 -15.3%
                                                   test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_ICCG_LAMBDA/44217     40.86     34.55 -15.4%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, GreaterThanZero, First>    369.11    311.37 -15.6%
                                                            test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_ICCG_RAW/44217     41.41     34.92 -15.7%
                                                              test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_EOS_RAW/5001      5.35      4.50 -15.8%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, LessThanZero, First>    256.73    215.68 -16.0%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_ADI_LAMBDA/44217    643.09    539.12 -16.2%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, GreaterThanZero, Last>    402.66    336.69 -16.4%
                                                               test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_ADI_RAW/171      2.39      1.99 -16.7%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_HYDRO_2D_LAMBDA/44217   2830.50   2353.95 -16.8%
                                                  test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, LessThanZero, Last>    229.57    190.85 -16.9%
                                                        test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_HYDRO_2D_RAW/44217   2844.48   2357.51 -17.1%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, LessThanZero, Mid>    363.16    300.43 -17.3%
                                                       test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_MAT_X_MAT_RAW/44217 313791.67 258684.42 -17.6%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, LessThanZero, First>    364.97    300.20 -17.7%
                                              test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, GreaterThanZero, First>    262.98    216.18 -17.8%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, EqZero, None>    243.73    199.81 -18.0%
                                                       test-suite :: SingleSource/Benchmarks/Polybench/medley/floyd-warshall/floyd-warshall.test      1.23      1.00 -18.6%
                                               test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_MAT_X_MAT_LAMBDA/5001   4777.76   3877.54 -18.8%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<32, GreaterThanZero, None>    305.70    247.67 -19.0%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_INT_PREDICT_LAMBDA/44217    254.75    206.17 -19.1%
                             test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BILINEAR_INTERPOLATION/256   3179.03   2571.11 -19.1%
                                            test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_DIFF_PREDICT_LAMBDA/5001     29.38     23.49 -20.0%
                                                             test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_EOS_RAW/44217     50.20     40.05 -20.2%
                                               test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<8, GreaterThanZero, First>    573.39    452.40 -21.1%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<31, EqZero, None>    141.53    110.45 -22.0%
                                                       test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<15, EqZero, First>    256.85    200.01 -22.1%
                                                         test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, EqZero, Mid>    230.12    177.50 -22.9%
                                                    test-suite :: MicroBenchmarks/LCALS/SubsetCRawLoops/lcalsCRaw.test:BM_DIFF_PREDICT_RAW/44217    811.13    622.96 -23.2%
                                                   test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, LessThanZero, Mid>    444.73    336.54 -24.3%
                                           test-suite :: MicroBenchmarks/LCALS/SubsetCLambdaLoops/lcalsCLambda.test:BM_DIFF_PREDICT_LAMBDA/44217    829.08    626.51 -24.4%
                                                                                  test-suite :: SingleSource/Benchmarks/CoyoteBench/lpbench.test      2.16      1.63 -24.6%
                                                 test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, LessThanZero, First>    311.68    229.75 -26.3%
                                                             test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test      9.96      7.13 -28.4%
                                                        test-suite :: MicroBenchmarks/MemFunctions/MemFunctions.test:BM_MemCmp<16, EqZero, Last>    254.20    176.09 -30.7%
                             test-suite :: MicroBenchmarks/ImageProcessing/Interpolation/Interpolation.test:BENCHMARK_BILINEAR_INTERPOLATION/128    926.34    639.91 -30.9%
                                                           test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm.test      7.14      4.72 -33.8%
                                                             test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test      7.45      4.72 -36.6%
                                                                                                                              Geomean difference                      -7.4%
              results1        results        diff
count       575.000000     570.000000  570.000000
mean       4724.011123    4412.941032   -0.072437
std       44018.331108   41317.927335    0.051249
min           0.602700       0.608313   -0.366163
25%           2.803722       2.776350   -0.091084
50%          95.772044      98.814336   -0.060941
75%         408.363042     391.209962   -0.044793
max      811740.278708  770247.007718    0.208757
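
For anyone checking the numbers: the percentage column and the `diff` statistics appear to be relative changes, second column / first column - 1. That reading is an assumption, but it is consistent with the `min` of -0.366163 matching the 2mm row (7.45 to 4.72). Below is a minimal sketch of that arithmetic. The helper names `relative_diff` and `geomean_diff` are hypothetical, this is not the script that produced the table, and recomputing from the rounded values printed above can differ from the table in the last digit.

```python
import math

# (first column, second column) pairs copied from three rows of the table above.
# Assumption: the first column is the baseline run, the second is the patched run.
ROWS = {
    "SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test": (7.45, 4.72),
    "SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm.test": (7.14, 4.72),
    "SingleSource/Benchmarks/CoyoteBench/lpbench.test": (2.16, 1.63),
}


def relative_diff(baseline, patched):
    """Relative change of patched versus baseline (negative means faster)."""
    return patched / baseline - 1.0


def geomean_diff(pairs):
    """Geometric mean of the patched/baseline ratios, as a relative change."""
    log_ratios = [math.log(patched / baseline) for baseline, patched in pairs]
    return math.exp(sum(log_ratios) / len(log_ratios)) - 1.0


if __name__ == "__main__":
    for name, (baseline, patched) in ROWS.items():
        print(f"{name}: {relative_diff(baseline, patched):+.1%}")
    print(f"geomean over these rows: {geomean_diff(ROWS.values()):+.1%}")
```

The geometric mean is used here because it averages ratios symmetrically: a 2x slowdown and a 2x speedup cancel out, which an arithmetic mean of the percentages would not do.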

Numbers look great.

Maybe you could take a look at one big regression in MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_PRESSURE_CALC_LAMBDA/5001?

Maybe you could take a look at one big regression in MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test:BM_PRESSURE_CALC_LAMBDA/5001

I had a look. As I mentioned above, many of the MicroBenchmarks are very unstable. I ran both versions several times, and all results fall in the same range in both cases. I also compared the generated binaries: the binary that contains this microbenchmark is identical in both builds.

So I should also mention that some of the improvements may likewise be a result of this benchmark instability.

Ah, right. Thanks for info.

lebedev.ri resigned from this revision.Jan 12 2023, 5:31 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.