miniGMG is a compact benchmark for understanding the performance challenges associated with geometric multigrid solvers found in applications built from AMR MG frameworks like CHOMBO or BoxLib when running on modern multi- and manycore-based supercomputers. It includes both productive reference examples and highly optimized implementations for CPUs and GPUs. It is sufficiently general that it has been used to evaluate a broad range of research topics, including PGAS programming models and algorithmic tradeoffs inherent in multigrid. miniGMG was developed under the CACHE Joint Math-CS Institute.
On Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz:
compile_time: 22.3845
exec_time: 7.9860
Maximum resident set size (kbytes): 1012464
This is going to work only on x86. Given that we're not going to time things in the test suite like this, you might just do something like:
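A minimal sketch of what such a fallback could look like. The patch itself is not shown here, so this assumes the timer is a function named `CycleTime` that returns a 64-bit cycle count and that the x86 path reads the time-stamp counter with `rdtsc`. The opt-in macro `ENABLE_CYCLE_TIMER` is hypothetical and only illustrates keeping timing off by default.

```c
#include <stdint.h>

/* ENABLE_CYCLE_TIMER is a hypothetical opt-in macro, not part of the patch. */
#if defined(ENABLE_CYCLE_TIMER) && (defined(__x86_64__) || defined(__i386__))
/* x86 only: read the time-stamp counter via inline assembly. */
uint64_t CycleTime(void) {
  uint32_t lo, hi;
  __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
  return ((uint64_t)hi << 32) | (uint64_t)lo;
}
#else
/* Every other target, or timing disabled: report no elapsed cycles. */
uint64_t CycleTime(void) {
  return 0;
}
#endif
```

With the macro left undefined, the benchmark builds on any target and every timer reading is 0, so the reported timings carry no information but the numerical output is unaffected.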
Clang also has a nice builtin that we can use instead of the inline assembly. We could add that as well in case anyone would like to enable timing:
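The builtin in question is presumably `__builtin_readcyclecounter()`, which Clang documents as returning a 64-bit cycle count (or 0 on targets without a usable counter). A hedged sketch of how it could replace the inline assembly follows; the function name `CycleTime` and the opt-in macro `ENABLE_CYCLE_TIMER` are assumptions for illustration, not taken from the patch.

```c
#include <stdint.h>

/* Compilers without __has_builtin: treat every builtin as unavailable. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* ENABLE_CYCLE_TIMER is a hypothetical opt-in macro, not part of the patch. */
#if defined(ENABLE_CYCLE_TIMER) && __has_builtin(__builtin_readcyclecounter)
/* Clang lowers this to the target's cycle counter where one exists. */
uint64_t CycleTime(void) {
  return __builtin_readcyclecounter();
}
#else
/* Timing disabled, or the builtin is not provided by this compiler. */
uint64_t CycleTime(void) {
  return 0;
}
#endif
```

Keeping the builtin behind an opt-in macro is a cautious choice: on some targets the underlying counter may not be readable from user space, so enabling it unconditionally could fail at run time.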