Add initial XRay benchmarks using the Google Benchmark library 1.1.0 (not run by default; you must explicitly include MicroBenchmarks in TEST_SUITE_SUBDIRS).
Sorry for the delay, @eizan -- some thoughts on the CMake config.
MultiSource/Benchmarks/CMakeLists.txt, line 22 (On Diff #95895):
We really only ought to include XRay if we can actually run the benchmarks. I'd say we can only do this for x86_64, and only if we can actually build the XRay benchmarks with XRay instrumentation.
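A minimal sketch of such a gate, assuming the test-suite's existing ARCH architecture variable (the exact variable name is an assumption here):

    # Only descend into the XRay benchmarks on x86 targets.
    if(ARCH STREQUAL "x86")
      add_subdirectory(XRay)
    endif()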
MultiSource/Benchmarks/XRay/CMakeLists.txt, line 1 (On Diff #95895):
I think we first need to detect whether the compiler supports the -fxray-instrument flag in some sort of config check, similar to the cmake/config-ix.cmake logic in compiler-rt that detects whether the compiler supports that flag.
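Something along these lines might work -- a minimal sketch using CMake's stock check_cxx_compiler_flag (the result variable name is made up):

    include(CheckCXXCompilerFlag)
    # Does the selected compiler accept -fxray-instrument at all?
    check_cxx_compiler_flag("-fxray-instrument" COMPILER_HAS_FXRAY_INSTRUMENT)
    if(NOT COMPILER_HAS_FXRAY_INSTRUMENT)
      return()  # Skip the XRay benchmarks entirely.
    endif()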
MultiSource/Benchmarks/XRay/CMakeLists.txt, line 11 (On Diff #95895):
Is there a way to add a dependency on the benchmark library instead, through some indirection -- maybe the benchmark lib exposes a variable that defines where the static library is? Also, you probably want to add ${CMAKE_THREAD_LIBS_INIT} instead of pthread directly.
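For the thread-library part, a hedged sketch (the target name retref-bench and library target benchmark are taken from elsewhere in this thread; the exact wiring is an assumption):

    # Let CMake pick the right threading flags instead of hardcoding pthread.
    find_package(Threads REQUIRED)
    target_link_libraries(retref-bench benchmark ${CMAKE_THREAD_LIBS_INIT})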
Could you explain the rationale for adding libs/benchmark-1.1.0?
Also, since that seems to have a different license, you probably need to add that directory to the top-level LICENSE.TXT file as one of the sub-directories with "additional or alternate copyrights, licenses, and/or restrictions".
MultiSource/Benchmarks/XRay/CMakeLists.txt, lines 6–7 (On Diff #95895):
It looks weird to me that this CMakeLists.txt explicitly sets different targets for -O0, -O1, -O2, -O3. I don't think other programs in the test-suite do this.
I mentioned this a while back on the mailing list (http://lists.llvm.org/pipermail/llvm-dev/2016-September/104392.html), and the consensus was that adding Google Benchmark to the test-suite would be fine. We've written the micro-benchmark to use a benchmarking library that is widely available and easy to use (and familiar to me). Based on the discussion in that thread, there seemed to be very little opposition to adding them to the test-suite at the time.
Also, since that seems to have a different license, you probably need to add that directory to the top-level LICENSE.TXT file as one of the sub-directories with "additional or alternate copyrights, licenses, and/or restrictions".
Good point, I agree 100%.
MultiSource/Benchmarks/XRay/CMakeLists.txt, lines 6–7 (On Diff #95895):
Now that you mention it, this does seem weird. This is mostly a faithful reproduction of my attempts at doing this with a shell script. @eizan has graciously volunteered to port this into something that can be part of the LLVM test-suite. If it would be more acceptable to just have a single target, then that's something we ought to change this to.
- I think MultiSource/xxx is the wrong choice here. Things in that directory all produce some stdout/stderr output that is then compared against the reference_output file.
- Today's lit time measurements are performed "from the outside", i.e. we time how long the process takes. My understanding is that Google Benchmark is designed as a microbenchmarking library that runs a small benchmark a varying number of times until certain confidence levels are reached. This means the process as a whole will have different runtimes depending on how fast those confidence levels are reached, and the numbers measured are probably not helpful because of that. (From skimming the Google Benchmark documentation, the number of iterations varies and the number of repetitions (--benchmark_repetitions=10) is one level higher, yes?)
Don't get me wrong: I welcome the addition of a microbenchmarking library to the llvm test-suite, and I can already think of a few other uses/benchmarks that should be converted to it. But we should get it right:
- IMO this should go into a custom subdirectory as it is a fundamentally different way to run benchmarks. Maybe /Micro.
- This should have a new strategy for running the benchmarks and extracting the resulting numbers. So it should have an extension in test-suite/litsupport.
- As long as this isn't worked out, we could also keep it in a separate repository. The test-suite is modular nowadays: the test-suite cmake searches for subdirectories containing a CMakeLists.txt and includes those automatically, so using an extra repository is as easy as checking it out into the test-suite directory to have it picked up.
MultiSource/Benchmarks/XRay/CMakeLists.txt, lines 6–7 (On Diff #95895):
I agree; we'd rather run the benchmark suite multiple times instead of hardcoding optimization flags. (That way we can also test -Os, -Oz, -Og, -mllvm -enable-my-cool-optimization, or whatever someone cares about.)
libs/CMakeLists.txt, lines 3–9:
Using ExternalProject is always a bit problematic: it is not clear how flags get propagated from the top to the subprojects, the libraries are not part of the target namespace (see above where you use the full path to libbenchmark.a), etc. This should rather be rewritten to build the Google Benchmark library as part of the main test-suite build.
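A sketch of the suggested restructuring, assuming the library stays vendored in-tree:

    # Build Google Benchmark as part of the main build so the usual
    # CMAKE_C(XX)_FLAGS propagate and the 'benchmark' target is visible
    # to the rest of the tree.
    add_subdirectory(benchmark-1.1.0)
    # Consumers can then write target_link_libraries(foo benchmark)
    # instead of a hardcoded path to libbenchmark.a.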
lit.cfg, line 15 (On Diff #95895):
This is a bad fit here, as it will be passed to all benchmarks yet is hard to discover when you just look at the XRay benchmark files. You should rather pass this specifically to the XRay benchmarks only.
Hello, sorry for taking so long to make the suggested changes.
- add libs as subdir to test-suite and mark as not a test subdir
- Convert Google Benchmark dir from external project to plain subdirectory
- make XRay benchmark single target, use new benchmark location
- move XRay dir from Multisource to MicroBenchmarks
- only make MicroBenchmarks/XRay available on x86
- Only provide MicroBenchmarks/XRay if x86 and support -fxray-instrument
- move XRAY_OPTIONS env setting from toplevel to local lit.cfg
- Add Benchmark entry to LICENSE.TXT
Can you please summarize the state of this and the eventual goal? It looks like:
- Eventually, we need a new LNT plugin (i.e. something like what we have in LNTBased/*) that understands how to interpret the output of the Google benchmark library.
- Right now we don't have that, so you're just fixing the number of iterations to make the tests have meaningful overall times.
Is that correct? Am I missing anything?
I'm happy to see this happening; I'd also like to convert TSVC to use the Google benchmark library and add other benchmarks of similar type in a similar fashion.
I hope this could all be done within the test-suite, probably extending test-suite/litsupport. I hope/think that no changes to LNT will be needed, but I haven't investigated myself.
For example, I'm thinking that test-suite/litsupport could end up reporting the execution time of a single iteration of the code benchmarked with the Google Benchmark library, and that will map neatly to concepts already understood by LNT.
I hope this could all be done within the test-suite, probably extending test-suite/litsupport. I hope/think that no changes to LNT will be needed, but I haven't investigated myself.
Just to be clear, LNTBased/* is in the test suite; LNT supports plugins and those can be in the test suite. That having been said, if we can do this just within litsupport, that potentially seems better. The key feature seems to be being able to report multiple benchmark timings per actual executable.
For example, I'm thinking that test-suite/litsupport could end up reporting the execution time of a single iteration of the code benchmarked with the Google Benchmark library, and that will map neatly to concepts already understood by LNT.
- We should really add a litsupport/ plugin rather than starting an LNTBased suite. I see no advantage to using LNTBased, but at the same time writing a new runner from scratch means you have to reimplement (or not implement) code-size and compile-time measurements, PGO data collection feeding back into the build, etc.
CMakeLists.txt, line 155:
Why do you unconditionally include MicroBenchmarks instead of using the existing TEST_SUITE_SUBDIRS mechanism?
MicroBenchmarks/CMakeLists.txt, lines 2–5:
It seems like the x86 and XRay checks should rather go into the XRay directory. I can easily imagine other microbenchmarks getting added that do not depend on those flags.
MicroBenchmarks/XRay/CMakeLists.txt, lines 9–10:
Please don't add commented-out code.
MicroBenchmarks/XRay/lit.local.cfg, line 2:
Don't you need some file(COPY) in CMakeLists.txt that copies this file into the build dir?
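Presumably something like this in MicroBenchmarks/XRay/CMakeLists.txt (a sketch; the paths assume the usual source/build directory mirroring):

    # Mirror the local lit config into the build directory so lit finds it
    # next to the built benchmark binaries.
    file(COPY ${CMAKE_CURRENT_SOURCE_DIR}/lit.local.cfg
         DESTINATION ${CMAKE_CURRENT_BINARY_DIR})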
- The lit data format currently does not allow multiple values for a metric.
- I'm not sure I see why that would be needed here.
- It is easy to do some form of aggregation (like taking the minimum) in the litsupport code to reduce the result to a single number. And doesn't Google Benchmark already reduce the result to a single number?
- It shouldn't be too hard to extend lit to support other data formats (it's just passing along those metrics so there is little need to restrict them).
My comment from last time still stands:
- Short of having a lit plugin that extracts the numbers reported by Google Benchmark, we are currently measuring the time that the executable runs.
- My impression was that this time varies because Google Benchmark will do a variable number of iterations. My understanding is that --benchmark_repetitions does not fix that.
- Without stable numbers coming out of this, LNT will report regressions that are not real and cannot be fixed.
I don't mean multiple times for each benchmark, I mean that there are multiple benchmarks for which we should individually collect timings. In the benchmark added here, for example, we'll get individual timings out for:
    BENCHMARK(BM_ReturnInstrumentedUnPatched);
    BENCHMARK(BM_ReturnInstrumentedPatchedThenUnpatched);
    BENCHMARK(BM_ReturnInstrumentedPatched);
    ...
and we should separately record them all.
- It is easy to do some form of aggregation (like taking the minimum) in the litsupport code to reduce the result to a single number. And doesn't Google Benchmark already reduce the result to a single number?
- It shouldn't be too hard to extend lit to support other data formats (it's just passing along those metrics so there is little need to restrict them).
Ah right.
- Today's lit works best if you have one file for each test; also, test discovery is done before any test is run.
- If there is a flag to only run a specific benchmark in a Google Benchmark executable, then we could work around this limitation on the cmake side:
Just generate multiple .test files with slightly different RUN: lines (assume that option is called --only-test):
    retref-bench-unpatched.test contains: RUN: retref-bench --only-test unpatched
    retref-bench-patched.test contains: RUN: retref-bench --only-test patched
    ...
It would require us to repeat the sub-benchmark names in the cmake file (or write cmake logic to scan the source code?).
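As a sketch of the CMake side (keeping the hypothetical --only-test flag from above; the list variable and sub-benchmark names are hand-maintained placeholders):

    # Emit one .test file per sub-benchmark so lit discovers and times
    # them individually.
    set(XRAY_SUB_BENCHMARKS unpatched patched)
    foreach(sub ${XRAY_SUB_BENCHMARKS})
      file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/retref-bench-${sub}.test
           "RUN: retref-bench --only-test ${sub}\n")
    endforeach()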
I suppose we could do this, but realistically, we need to be able to collect multiple named timings per test. We need some way of telling lit, "don't use the timing from this executable, it provides timing numbers some other way", and then some mechanism for parsing the output to collect those numbers. It seems fairly straightforward to do this as an LNT plugin because the benchmark library has its own well-defined output format, which is why I mentioned that, but we could build something into lit as well.
- Remove stale commented out code from MicroBenchmarks/XRay/CMakeLists.txt
- Add missing file(COPY...) command for XRay lit.local.cfg
- Move XRay-related system introspection into XRay CMakeLists.txt
- require ENABLE_XRAY_MICROBENCHMARKS=1 to run XRay Microbenchmarks
- don't explicitly add MicroBenchmarks to test suite list
It might sound weird, but I agree with both of you. :)
On the one hand, we shouldn't need to introduce too much change all at once -- right now I'm happy to just get the benchmarks in, and have them be runnable by people who want to test this on different platforms. While they might not look like the rest of the benchmarks/tests already in the test-suite, I'd be very happy to iterate on it, following both @MatzeB's and @hfinkel's suggestions. More concretely:
- In the interim, create multiple .test files each running one of the named benchmarks in the single .cc file.
- In the future teach LNT to understand the Google Benchmark output.
- Once LNT knows how to interpret the data and we can set the running of the tests such that we can reduce variance (there's a mode for Google Benchmark that only shows mean times and standard deviation by providing time bounds and iteration bounds) then we can sanitise the multiple files per benchmark.
Now, for this particular change, do the current state and the plan to iterate seem reasonable to both of you (@MatzeB and @hfinkel)?
CMakeLists.txt, lines 187–189:
Make sure you rebase to ToT! This part has looked slightly different since a few weeks ago.
MicroBenchmarks/XRay/CMakeLists.txt, line 1:
This has to be called TEST_SUITE_ENABLE_XRAY_MICROBENCHMARKS or similar. The TEST_SUITE_ prefix is necessary for people who run the test-suite as part of a normal llvm build (I think there are still some people doing that); without the prefix, the option will not end up being set for the test-suite in the combined build. Alternatively, you could just not add an option and exclude MicroBenchmarks from the default subdirectory list (the same way you already excluded the libs directory). People can then opt in via TEST_SUITE_SUBDIRS.
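For reference, the suggested naming would look roughly like this (a sketch; the OFF default is an assumption):

    # The TEST_SUITE_ prefix makes the option reach the test-suite when it
    # is configured as part of a combined llvm build.
    option(TEST_SUITE_ENABLE_XRAY_MICROBENCHMARKS
           "Build the XRay microbenchmarks" OFF)
    if(NOT TEST_SUITE_ENABLE_XRAY_MICROBENCHMARKS)
      return()
    endif()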
Rebasing off of latest head...
- add libs as subdir to test-suite and mark as not a test subdir
- Convert Google Benchmark dir from external project to plain subdirectory
- make XRay benchmark single target, use new benchmark location
- move XRay dir from Multisource to MicroBenchmarks
- only make MicroBenchmarks/XRay available on x86
- Only provide MicroBenchmarks/XRay if x86 and support -fxray-instrument
- move XRAY_OPTIONS env setting from toplevel to local lit.cfg
- Add Benchmark entry to LICENSE.TXT
- Remove stale commented out code from MicroBenchmarks/XRay/CMakeLists.txt
- Add missing file(COPY...) command for XRay lit.local.cfg
- Move XRay-related system introspection into XRay CMakeLists.txt
- require ENABLE_XRAY_MICROBENCHMARKS=1 to run XRay Microbenchmarks
- don't explicitly add MicroBenchmarks to test suite list
- rename ENABLE_XRAY_MICROBENCHMARKS -> TEST_SUITE_ENABLE...
Sure, we don't *need* to implement the running strategy in this patch; I'm fine with pushing it like this. Just don't enable this by default until we have a proper way to collect the Google Benchmark data.
- Actually, I have one more request: could you move libs/benchmark-1.1.0 to MicroBenchmarks/benchmark-1.1.0 (or MicroBenchmarks/libs/benchmark-1.1.0 if you prefer that)? The library should only be used by microbenchmarks, and that way people who do things like just running CTMark don't have to compile it. It also saves you from modifying the toplevel cmake file and adding that exception for libs to the TEST_SUITE_SUBDIRS list construction.
- Move libs/benchmark-1.1.0 into MicroBenchmarks
- Guard building of all microbenchmarks based on ...ENABLE_XRAY_MICROBENCHMARKS
- Rename ...ENABLE_XRAY_MICROBENCHMARKS -> ...ENABLE_MICROBENCHMARKS
A few notes/questions that I'd like to bring to attention:
- I've interpreted the suggestion as: be very careful not to build benchmark-1.1.0 unless the XRay benchmarks that depend on it are built. Unless I guard the add_subdirectory(libs) call with a conditional in MicroBenchmarks/CMakeLists.txt, running the full benchmark suite without ENABLE_MICROBENCHMARKS=1 builds the benchmark library even though the XRay binaries aren't built. Is there a better way?
- I had to hack in add_dependencies(benchmark timeit-host); otherwise the benchmark build would fail due to it being built before timeit. Is there a better way?
- I'm somewhat flippantly presuming that all the targets in MicroBenchmarks will depend on the benchmark-1.1.0 lib. Is this okay?
See comments below but nothing blocking. LGTM.
I don't think we need TEST_SUITE_ENABLE_MICROBENCHMARKS; there is already TEST_SUITE_SUBDIRS. Just change the toplevel CMakeLists.txt to read like this:
    ...
    # Exclude tools, CTMark and MicroBenchmarks from default list
    if(NOT ${subdir} STREQUAL tools AND
       NOT ${subdir} STREQUAL CTMark AND
       NOT ${subdir} STREQUAL MicroBenchmarks)
      list(APPEND TEST_SUITE_SUBDIRS ${subdir})
    endif()
    ...
You would then use -DTEST_SUITE_SUBDIRS="MicroBenchmarks" or -DTEST_SUITE_SUBDIRS="SingleSource;MultiSource;MicroBenchmarks" to enable it so we don't need a separate TEST_SUITE_ENABLE_MICROBENCHMARKS flag.
A few notes/questions that I'd like to bring to attention:
- I've interpreted the suggestion as: be very careful not to build benchmark-1.1.0 unless the XRay benchmarks that depend on it are built. Unless I guard the add_subdirectory(libs) call with a conditional in MicroBenchmarks/CMakeLists.txt, running the full benchmark suite without ENABLE_MICROBENCHMARKS=1 builds the benchmark library even though the XRay binaries aren't built. Is there a better way?
It is fine as is. The library won't be built if the MicroBenchmarks are skipped as a whole.
- I had to hack in add_dependencies(benchmark timeit-host); otherwise the benchmark build would fail due to it being built before timeit. Is there a better way?
You can use test_suite_add_build_dependencies(TARGETNAME) to get some abstraction.
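That is, roughly (a sketch; the helper is the test-suite function named above):

    # Instead of the ad-hoc add_dependencies(benchmark timeit-host) hack,
    # let the test-suite helper wire up the standard build-tool dependencies.
    test_suite_add_build_dependencies(benchmark)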
- I'm somewhat flippantly presuming that all the targets in MicroBenchmarks will depend on the benchmark-1.1.0 lib. Is this okay?
Seems reasonable to me.
- Remove TEST_SUITE_ENABLE_MICROBENCHMARKS
- benchmark targets properly depend on test suite build deps
Submitted on behalf of eizan@ as http://llvm.org/viewvc/llvm-project?view=revision&revision=307017