This is an archive of the discontinued LLVM Phabricator instance.

Codegen: Decrease minimum jump table density
ClosedPublic

Authored by iteratee on Mar 16 2016, 1:26 PM.

Details

Reviewers
hans
Summary

Minimum density for both optsize and non optsize are now options
-sparse-jump-table-density (default 10) for non optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at google
at the cost of a small codesize increase. For code compiled with -Os,
the old behavior continues

Diff Detail

Event Timeline

iteratee updated this revision to Diff 50858.Mar 16 2016, 1:26 PM
iteratee retitled this revision from to Codegen: Decrease minimum jump table density.
iteratee updated this object.
iteratee set the repository for this revision to rL LLVM.
iteratee added a reviewer: hans.
iteratee added subscribers: echristo, llvm-commits, timshen.

Some inline comments, I stopped putting ditto down because my hands got tired. :)

-eric

test/CodeGen/ARM/2011-08-25-ldmia_ret.ll
17

Only one function could probably pass your command line option alternately?

No preference other than it'll isolate the testcase from any other optsize changes that happen.

test/CodeGen/X86/switch-bt.ll
17

Ditto.

108

Ditto.

iteratee updated this revision to Diff 50897.Mar 16 2016, 5:09 PM
iteratee removed rL LLVM as the repository for this revision.
iteratee marked 3 inline comments as done.

For all the optsize cases, I have now either changed the switch values or passed a density as a flag.

Patch looks good at this point, would be good to get size/performance numbers on a run of the testsuite (or something, e.g. SPEC, etc).

Thanks!

-eric

hans edited edge metadata.Mar 17 2016, 2:30 AM

Thanks! I think this basically looks good.

Now that you're passing flags to llc instead, do we have any tests checking that the "optsize" and "minsize" function attributes have the desired effect?

We should probably have a test with functions with no attribute, optsize, and minsize that verifies the thresholds.

And as Eric said, some numbers showing how binary size (e.g. a self-hosted clang build) and perf are affected would be great.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
89

for normal what?

91

For a user who wants to tweak these flags, I'm not sure if the "sparse-" and "dense-" names are the most friendly.

What would you think of calling them "-jump-table-density" and "-optsize-jump-table-density"?

95

ultra nit: period at end of comment.

8042

Dense jump table density is dense? :-)
I think this variable name would come out better if the flag was renamed as suggested above.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
308

How about calling the new parameter MinDensity, RequiredDensity, or something like that to indicate that it's a threshold that the actual density gets compared against?

I'll get to the suggestions and benchmarking, but wanted to report on size:
clang compiled without the change: 176406957
clang compiled with the change: 176431533

net change: 24.0 KiB a change of 0.0014%

Nice size results. Seems like it's only going to matter in times when we want the performance.

Can't wait for the numbers.

Thanks!

Key: for each run avg, median, stddev (stddev as a percent), 10th percentile.
%Change: avg, median, 10th percentile. %Change is change in runtime, so negative percent is an improvement.

Unchanged is listed first.

I'm only showing the tests with a difference of more than 2 percent.
I ran each of these benchmarks 50 times.

The only significant results are in TSVC. I'll see how many benchmarks that is and maybe run them more times.

test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 7.014 7.012 0.290 (4.130%) 6.672
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 7.021 7.011 0.199 (2.835%) 6.807
% Change: -0.016 % -0.016 % 2.013 %
test-suite :: MultiSource/Benchmarks/MallocBench/cfrac/cfrac.test 0.924 0.926 0.054 (5.828%) 0.857
test-suite :: MultiSource/Benchmarks/MallocBench/cfrac/cfrac.test 0.914 0.907 0.042 (4.616%) 0.858
% Change: -1.993 % -1.993 % 0.163 %
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-dbl/Equivalencing-dbl.test 1.976 1.964 0.109 (5.525%) 1.840
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-dbl/Equivalencing-dbl.test 2.000 1.959 0.122 (6.075%) 1.882
% Change: -0.242 % -0.242 % 2.288 %
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.test 1.246 1.233 0.091 (7.321%) 1.144
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.test 1.239 1.219 0.066 (5.311%) 1.169
% Change: -1.172 % -1.172 % 2.211 %
test-suite :: MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt.test 1.745 1.751 0.096 (5.491%) 1.628
test-suite :: MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt.test 1.732 1.706 0.095 (5.460%) 1.630
% Change: -2.601 % -2.601 % 0.086 %
test-suite :: MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl.test 3.115 3.097 0.110 (3.546%) 2.982
test-suite :: MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl.test 3.190 3.171 0.124 (3.872%) 3.048
% Change: 2.392 % 2.392 % 2.220 %
test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.test 3.691 3.673 0.177 (4.787%) 3.512
test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.test 3.685 3.667 0.181 (4.925%) 3.442
% Change: -0.162 % -0.162 % -1.988 %
test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-dbl/LinearDependence-dbl.test 3.191 3.140 0.193 (6.044%) 2.978
test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-dbl/LinearDependence-dbl.test 3.233 3.236 0.159 (4.908%) 3.051
% Change: 3.070 % 3.070 % 2.478 %
test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test 3.557 3.524 0.166 (4.669%) 3.378
test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test 3.577 3.556 0.118 (3.286%) 3.458
% Change: 0.899 % 0.899 % 2.374 %
test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt.test 3.480 3.457 0.161 (4.639%) 3.292
test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt.test 3.520 3.493 0.128 (3.642%) 3.382
% Change: 1.053 % 1.053 % 2.721 %
test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl.test 3.411 3.379 0.176 (5.147%) 3.211
test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl.test 3.469 3.459 0.168 (4.856%) 3.275
% Change: 2.372 % 2.372 % 2.009 %
test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test 2.399 2.354 0.156 (6.496%) 2.228
test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test 2.415 2.415 0.099 (4.110%) 2.266
% Change: 2.594 % 2.594 % 1.705 %
test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-flt/Recurrences-flt.test 3.480 3.449 0.142 (4.087%) 3.349
test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-flt/Recurrences-flt.test 3.409 3.412 0.110 (3.237%) 3.266
% Change: -1.089 % -1.089 % -2.467 %
test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test 2.778 2.749 0.137 (4.941%) 2.632
test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test 2.825 2.824 0.135 (4.784%) 2.648
% Change: 2.715 % 2.715 % 0.581 %
test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt.test 1.002 1.002 0.077 (7.686%) 0.919
test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt.test 1.041 1.043 0.069 (6.591%) 0.962
% Change: 4.099 % 4.099 % 4.579 %

When I re-ran the TSVC tests 100 times on a quieter machine, all the
differences were less than 2%.

iteratee updated this revision to Diff 51329.Mar 22 2016, 1:07 PM
iteratee edited edge metadata.
iteratee marked 4 inline comments as done.

Tidied up names according to comments.

iteratee marked an inline comment as done.Mar 22 2016, 1:07 PM
hans added a comment.Mar 22 2016, 1:13 PM

Tidied up names according to comments.

Much better, thanks!

I think a test that covers the different thresholds for optsize and regular functions is still needed.

iteratee updated this revision to Diff 51850.Mar 28 2016, 4:02 PM

Add a test that verifies the density switch does what it says.

hans accepted this revision.Mar 28 2016, 4:12 PM
hans edited edge metadata.

lgtm

test/CodeGen/X86/switch-density.ll
75

This one's always a jump table right, so the comment is slightly wrong?

This revision is now accepted and ready to land.Mar 28 2016, 4:12 PM
iteratee closed this revision.Apr 7 2016, 9:17 PM