This is an archive of the discontinued LLVM Phabricator instance.

[llvm-exegesis] Analysis: writeMeasurementValue(): don't alloc string for double each time.
ClosedPublic

Authored by lebedev.ri on Nov 10 2018, 1:57 PM.

Details

Summary

Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time: (D54382)

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       9024.354355      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.18% )
...
            9.0262 +- 0.0161 seconds time elapsed  ( +-  0.18% )

New time:

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       8996.541057      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.19% )
...
            9.0045 +- 0.0172 seconds time elapsed  ( +-  0.19% )

-~0.3%, not that much. But this isn't the important part.

Old:

  • calls to allocation functions: 2109712
  • temporary allocations: 33112
  • bytes allocated in total (ignoring deallocations): 4.43 GB

New:

  • calls to allocation functions: 2095345 (-0.68%)
  • temporary allocations: 18745 (-43.39% !!!)
  • bytes allocated in total (ignoring deallocations): 4.31 GB (-2.71%)

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Nov 10 2018, 1:57 PM

I'm not entirely convinced here:

  • The code is harder to read,
  • the formatting pattern and the SmallString size may go out of sync,
  • the performance win is not that big.
tools/llvm-exegesis/lib/Analysis.cpp
101

Why 1 + ...?

I'm not entirely convinced here:

  • The code is harder to read,

I can totally bloat it will some comments/variables.

  • the formatting pattern and the SmallString size may go out of sync,

*May*.
Or may be removed completely, and this change was be pointless.
Or it may stay like this forever.
Who knows?
Right now this is a clear win from perf/memory standpoint.

  • the performance win is not that big.

Given the total perf improvement this patchset provides (or rather, the starting point) i will pretend i did not notice this point :)

tools/llvm-exegesis/lib/Analysis.cpp
101

You will need at most max_digits10 digits PLUS the decimal separator (.) to print the double.

lebedev.ri marked 2 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)
  • Bloated the code with temporary variables to explain everything.
  • Provide actual benchmark.
courbet accepted this revision.Nov 19 2018, 2:46 AM
This revision is now accepted and ready to land.Nov 19 2018, 2:46 AM

@courbet thank you for the review!