
Commit 50a1739

committed May 24, 2017
Add some tips on benchmarking.
llvm-svn: 303769
1 parent 979966f commit 50a1739


2 files changed: +88 -0 lines changed

 

‎llvm/docs/Benchmarking.rst

+87
@@ -0,0 +1,87 @@
==================================
Benchmarking tips
==================================


Introduction
============

When benchmarking a patch, we want to reduce every source of noise as
much as possible. How to do that is very OS dependent.

Note that low noise is required, but not sufficient: it does not
exclude measurement bias. See
https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf for
an example.

General
================================

* Use a high resolution timer, e.g. perf under Linux.

* Run the benchmark multiple times to be able to recognize noise.

* Disable as many processes or services as possible on the target system.

* Disable frequency scaling, turbo boost and address space
  randomization (see the OS-specific sections).

* Link statically if the OS supports it. That avoids any variation that
  might be introduced by loading dynamic libraries. This can be done
  by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.

* Try to avoid storage. On some systems you can use tmpfs. Putting the
  program, inputs and outputs on tmpfs avoids touching a real storage
  system, whose latency can vary considerably.

  To mount it (on Linux and FreeBSD at least)::

    mount -t tmpfs -o size=<XX>g none dir_to_mount

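To make the advice about repeated runs concrete, here is a minimal
sketch of how per-run timings could be summarized. It assumes the
timings were already collected with a high resolution timer, one number
per line. The function name ``summarize`` and its output format are
illustrative only and are not part of LLVM or perf::

    # Read one sample per line on stdin and print the sample count, the
    # mean, and the sample standard deviation as a percentage of the mean.
    summarize() {
        awk '
            # Accumulate the count, the sum and the sum of squares.
            { n++; sum += $1; sumsq += $1 * $1 }
            END {
                if (n < 2) { print "need at least 2 samples" > "/dev/stderr"; exit 1 }
                mean = sum / n
                if (mean == 0) { print "mean is zero" > "/dev/stderr"; exit 1 }
                var = (sumsq - n * mean * mean) / (n - 1)
                if (var < 0) var = 0   # guard against rounding error
                printf "n=%d mean=%.6f stddev_pct=%.3f\n", n, mean, 100 * sqrt(var) / mean
            }'
    }

For example, ``summarize < timings.txt`` (``timings.txt`` being a
hypothetical file of samples) prints a single summary line. If the
reported percentage is large compared to the effect you are trying to
measure, the setup is still too noisy.
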
Linux
=====

* Disable address space randomization::

    echo 0 > /proc/sys/kernel/randomize_va_space

* Set scaling_governor to performance::

    for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    do
      echo performance > "$i"
    done

* Use https://github.com/lpechacek/cpuset to reserve cpus for just the
  program you are benchmarking. If using perf, reserve at least 2 cores
  so that perf runs in one and your program in another::

    cset shield -c N1,N2 -k on

  This will move all threads out of N1 and N2. The ``-k on`` means
  that even kernel threads are moved out.

* Disable the SMT pair of the cpus you will use for the benchmark. The
  pair of cpu N can be found in
  ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list``. Each
  sibling X listed there can be disabled with::

    echo 0 > /sys/devices/system/cpu/cpuX/online

* Run the program with::

    cset shield --exec -- perf stat -r 10 <cmd>

  This will run the command after ``--`` in the isolated cpus. The
  particular perf command runs the ``<cmd>`` 10 times and reports
  statistics.

With these in place you can expect perf variations of less than 0.1%.

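The procfs and sysfs steps above can be collected into small helper
functions. The sketch below is one possible arrangement, not a script
shipped with LLVM: the function names and the ``BENCH_ROOT`` variable
are hypothetical. ``BENCH_ROOT`` is a prefix prepended to every path so
that the functions can be tried against a scratch directory. Leave it
unset, and run as root, to act on the real system::

    # All names below are illustrative. BENCH_ROOT is a hypothetical path
    # prefix for dry runs; when unset, the real /proc and /sys are used.

    disable_aslr() {
        echo 0 > "${BENCH_ROOT:-}/proc/sys/kernel/randomize_va_space"
    }

    set_performance_governor() {
        for g in "${BENCH_ROOT:-}"/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
        do
            # Skip the unexpanded pattern if no cpu exposes cpufreq.
            [ -e "$g" ] || continue
            echo performance > "$g"
        done
    }

    # Print the SMT siblings of cpu $1, one per line, excluding cpu $1.
    # thread_siblings_list holds comma-separated cpus or ranges, such as
    # "0,4" or "0-1".
    smt_siblings() {
        tr ',' '\n' < "${BENCH_ROOT:-}/sys/devices/system/cpu/cpu$1/topology/thread_siblings_list" |
        while IFS=- read -r lo hi
        do
            i=$lo
            while [ "$i" -le "${hi:-$lo}" ]
            do
                [ "$i" -eq "$1" ] || echo "$i"
                i=$((i + 1))
            done
        done
    }

    # Take every SMT sibling of cpu $1 offline.
    offline_smt_siblings() {
        for s in $(smt_siblings "$1")
        do
            echo 0 > "${BENCH_ROOT:-}/sys/devices/system/cpu/cpu$s/online"
        done
    }

On a real machine these would be called as root before the benchmark
is run, for example ``disable_aslr``, ``set_performance_governor`` and
``offline_smt_siblings N1``.
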
Linux Intel
-----------

* Disable turbo mode::

    echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
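
Because this setting stays in effect until the file is written again,
it can be convenient to restore the previous value once the benchmark
has finished. The following is a minimal sketch of that idea. The
function name ``with_turbo_disabled`` and the ``BENCH_ROOT`` path prefix
(useful for trying the function on a scratch directory) are
hypothetical, and writing to the real file requires root::

    # Run the given command with turbo mode disabled, then restore the
    # previous no_turbo value. BENCH_ROOT is a hypothetical path prefix
    # for dry runs; when unset, the real /sys is used.
    with_turbo_disabled() {
        knob="${BENCH_ROOT:-}/sys/devices/system/cpu/intel_pstate/no_turbo"
        saved=$(cat "$knob") || return 1
        echo 1 > "$knob" || return 1
        status=0
        "$@" || status=$?
        # Put the knob back the way it was found.
        echo "$saved" > "$knob"
        return "$status"
    }

It could then wrap the benchmark invocation, as in
``with_turbo_disabled cset shield --exec -- perf stat -r 10 <cmd>``.
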

‎llvm/docs/index.rst

+1
@@ -90,6 +90,7 @@ representation.
    CodeOfConduct
    CompileCudaWithLLVM
    ReportingGuide
+   Benchmarking
 
 :doc:`GettingStarted`
    Discusses how to get up and running quickly with the LLVM infrastructure.
