As @eli.friedman suggested in https://bugs.llvm.org/show_bug.cgi?id=32384#c1, this patch makes the inlining of memset and memcpy more aggressive when compiling for speed. The tuning is unchanged when optimizing for size.
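For illustration only, here is a minimal sketch of the kind of call this tuning is about: a memset and a memcpy whose length is a compile-time constant. The `packet` struct and both helper functions are hypothetical and not part of the patch. Whether a given call is actually expanded inline depends on the target, the size, and the optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size type, used only for illustration. */
struct packet {
    uint8_t header[16];
    uint8_t payload[48];
};

/* The length passed to memset/memcpy below is a compile-time constant. */
static_assert(sizeof(struct packet) == 64, "packet is expected to be 64 bytes");

/* A constant-size memset: a candidate for expansion into inline stores
   instead of a library call when compiling for speed. */
void reset_packet(struct packet *p) {
    memset(p, 0, sizeof *p);
}

/* A constant-size memcpy: a candidate for expansion into inline
   load/store pairs instead of a library call. */
void copy_packet(struct packet *dst, const struct packet *src) {
    memcpy(dst, src, sizeof *dst);
}
```

Comparing the assembly generated at -O3 and at -Os for functions like these is one way to check whether the calls were expanded or left as library calls.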
The following experiment compiles SPEC CPU2000 with -O3 and runs it on a Cortex-A72. Several benchmarks benefit from this change, and the overhead on code size is low.
In the table, a positive score is an improvement and a positive text value is an increase in text size:
Benchmark | Score change | Text size change |
---|---|---|
spec2000/164.gzip | 0.01% | -0.01% |
spec2000/175.vpr | -0.46% | 0.01% |
spec2000/176.gcc | -0.28% | 0.01% |
spec2000/177.mesa | 0.75% | 0.08% |
spec2000/179.art | 0.39% | 0.00% |
spec2000/181.mcf | 0.26% | 0.00% |
spec2000/183.equake | -0.34% | -0.01% |
spec2000/186.crafty | 0.09% | 0.06% |
spec2000/188.ammp | 2.50% | 0.01% |
spec2000/197.parser | 0.21% | 0.00% |
spec2000/252.eon | 0.50% | 1.62% |
spec2000/253.perlbmk | 1.67% | 0.00% |
spec2000/254.gap | -0.40% | 0.01% |
spec2000/255.vortex | -0.24% | 0.00% |
spec2000/256.bzip2 | 0.01% | 0.00% |
spec2000/300.twolf | 0.59% | 0.00% |