Minimum density for both optsize and non optsize are now options
-sparse-jump-table-density (default 10) for non optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at google
at the cost of a small codesize increase. For code compiled with -Os,
the old behavior continues
Details
- Reviewers
hans
Diff Detail
Event Timeline
Some inline comments, I stopped putting ditto down because my hands got tired. :)
-eric
test/CodeGen/ARM/2011-08-25-ldmia_ret.ll | ||
---|---|---|
17 | Only one function could probably pass your command line option alternately? No preference other than it'll isolate the testcase from any other optsize changes that happen. | |
test/CodeGen/X86/switch-bt.ll | ||
17 | Ditto. | |
108 | Ditto. |
For all the optsize cases, I have now either changed the switch values or passed a density as a flag.
Patch looks good at this point, would be good to get size/performance numbers on a run of the testsuite (or something, e.g. SPEC, etc).
Thanks!
-eric
Thanks! I think this basically looks good.
Now that you're passing flags to llc instead, do we have any tests checking that the "optsize" and "minsize" function attributes have the desired effect?
We should probably have a test with functions with no attribute, optsize, and minsize that verifies the thresholds.
And as Eric said, some numbers showing how binary size (e.g. a self-hosted clang build) and perf are affected would be great.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
89 | for normal what? | |
91 | For a user who wants to tweak these flags, I'm not sure if the "sparse-" and "dense-" names are the most friendly. What would you think of calling them "-jump-table-density" and "-optsize-jump-table-density"? | |
95 | ultra nit: period at end of comment. | |
8042 | Dense jump table density is dense? :-) | |
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h | ||
308 | How about calling the new parameter MinDensity, RequiredDensity, or something like that to indicate that it's a threshold that the actual density gets compared against? |
I'll get to the suggestions and benchmarking, but wanted to report on size:
clang compiled without the change: 176406957
clang compiled with the change: 176431533
net change: 24.0 KiB a change of 0.0014%
Nice size results. Seems like it's only going to matter in times when we want the performance.
Can't wait for the numbers.
Thanks!
Key: for each run avg, median, stddev (stddev as a percent), 10th percentile.
%Change: avg, median, 10th percentile. %Change is change in runtime, so negative percent is an improvement.
Unchanged is listed first.
I'm only showing the tests with a difference of more than 2 percent.
I ran each of these benchmarks 50 times.
The only significant results are in TSVC. I'll see how many benchmarks that is and maybe run them more times.
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 7.014 7.012 0.290 (4.130%) 6.672
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 7.021 7.011 0.199 (2.835%) 6.807
% Change: -0.016 % -0.016 % 2.013 %
test-suite :: MultiSource/Benchmarks/MallocBench/cfrac/cfrac.test 0.924 0.926 0.054 (5.828%) 0.857
test-suite :: MultiSource/Benchmarks/MallocBench/cfrac/cfrac.test 0.914 0.907 0.042 (4.616%) 0.858
% Change: -1.993 % -1.993 % 0.163 %
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-dbl/Equivalencing-dbl.test 1.976 1.964 0.109 (5.525%) 1.840
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-dbl/Equivalencing-dbl.test 2.000 1.959 0.122 (6.075%) 1.882
% Change: -0.242 % -0.242 % 2.288 %
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.test 1.246 1.233 0.091 (7.321%) 1.144
test-suite :: MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.test 1.239 1.219 0.066 (5.311%) 1.169
% Change: -1.172 % -1.172 % 2.211 %
test-suite :: MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt.test 1.745 1.751 0.096 (5.491%) 1.628
test-suite :: MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt.test 1.732 1.706 0.095 (5.460%) 1.630
% Change: -2.601 % -2.601 % 0.086 %
test-suite :: MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl.test 3.115 3.097 0.110 (3.546%) 2.982
test-suite :: MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl.test 3.190 3.171 0.124 (3.872%) 3.048
% Change: 2.392 % 2.392 % 2.220 %
test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.test 3.691 3.673 0.177 (4.787%) 3.512
test-suite :: MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.test 3.685 3.667 0.181 (4.925%) 3.442
% Change: -0.162 % -0.162 % -1.988 %
test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-dbl/LinearDependence-dbl.test 3.191 3.140 0.193 (6.044%) 2.978
test-suite :: MultiSource/Benchmarks/TSVC/LinearDependence-dbl/LinearDependence-dbl.test 3.233 3.236 0.159 (4.908%) 3.051
% Change: 3.070 % 3.070 % 2.478 %
test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test 3.557 3.524 0.166 (4.669%) 3.378
test-suite :: MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test 3.577 3.556 0.118 (3.286%) 3.458
% Change: 0.899 % 0.899 % 2.374 %
test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt.test 3.480 3.457 0.161 (4.639%) 3.292
test-suite :: MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt.test 3.520 3.493 0.128 (3.642%) 3.382
% Change: 1.053 % 1.053 % 2.721 %
test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl.test 3.411 3.379 0.176 (5.147%) 3.211
test-suite :: MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl.test 3.469 3.459 0.168 (4.856%) 3.275
% Change: 2.372 % 2.372 % 2.009 %
test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test 2.399 2.354 0.156 (6.496%) 2.228
test-suite :: MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt.test 2.415 2.415 0.099 (4.110%) 2.266
% Change: 2.594 % 2.594 % 1.705 %
test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-flt/Recurrences-flt.test 3.480 3.449 0.142 (4.087%) 3.349
test-suite :: MultiSource/Benchmarks/TSVC/Recurrences-flt/Recurrences-flt.test 3.409 3.412 0.110 (3.237%) 3.266
% Change: -1.089 % -1.089 % -2.467 %
test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test 2.778 2.749 0.137 (4.941%) 2.632
test-suite :: MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl.test 2.825 2.824 0.135 (4.784%) 2.648
% Change: 2.715 % 2.715 % 0.581 %
test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt.test 1.002 1.002 0.077 (7.686%) 0.919
test-suite :: MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt.test 1.041 1.043 0.069 (6.591%) 0.962
% Change: 4.099 % 4.099 % 4.579 %
When I re-ran the TSVC tests 100 times on a quieter machine, all the
differences were less than 2%.
Much better, thanks!
I think a test that covers the different thresholds for optsize and regular functions is still needed.
lgtm
test/CodeGen/X86/switch-density.ll | ||
---|---|---|
75 | This one's always a jump table right, so the comment is slightly wrong? |
How about calling the new parameter MinDensity, RequiredDensity, or something like that to indicate that it's a threshold that the actual density gets compared against?