This is an archive of the discontinued LLVM Phabricator instance.

[opt-viewer] Reduce memory consumption
Closed, Public

Authored by anemet on Jul 18 2017, 1:01 AM.

Details

Summary

The observation is that we have many similar remarks containing lots of
identical strings (e.g. file paths, text from the remark). Storing a separate
copy of each of those strings in memory is wasteful. This patch interns all
the strings in the remark, so that a single immutable instance of each string
is kept and referenced everywhere.
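
As an illustration of the interning idea described above, here is a minimal sketch. It is not the code from this patch: the flat-dict remark layout and the `intern_remark` helper are assumptions made for the example, and it uses Python 3's `sys.intern`.

```python
import sys


def intern_remark(remark):
    """Return a copy of `remark` with its string keys and values interned.

    `remark` is a hypothetical flat dict standing in for a parsed remark.
    """
    return {
        sys.intern(key): sys.intern(value) if isinstance(value, str) else value
        for key, value in remark.items()
    }


# Build the strings at run time so each remark starts with its own copy,
# much as remarks parsed from separate files would.
raw_remarks = [
    {"File": "/".join(["lib", "Transforms", "foo.cpp"]), "Line": n}
    for n in range(3)
]
remarks = [intern_remark(r) for r in raw_remarks]

# After interning, equal strings are one shared object.
shared = remarks[0]["File"] is remarks[1]["File"]
print(shared)
```

The saving grows with the number of remarks that repeat the same path or message text, which fits the remark above that results may vary with typical path length.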

I get an average 20% heap size reduction with this, but the saving may vary
with the typical length of the file paths used. (I used heapy to report the
heap size.) Runtime is the same or slightly better.

| # of files               |   60 |  114 |  308 |  605 | 1370 |
| # of remarks             |  20K |  37K | 146K | 180K | 640K |
| Total file size (MB)     |   22 |   51 |  219 |  202 | 1034 |
|--------------------------+------+------+------+------+------|
| Heap size before (MB)    |  106 |  226 |  894 |  934 | 3573 |
| Heap size after (MB)     |   86 |  179 |  694 |  739 | 2798 |
| Rate (after / before)    | 0.81 | 0.79 | 0.78 | 0.79 | 0.78 |
|--------------------------+------+------+------+------+------|
| Average remark size (KB) | 4.30 | 4.84 | 4.75 | 4.11 | 4.37 |
| Mem2disk ratio           | 3.91 | 3.51 | 3.17 | 3.66 | 2.71 |
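
The derived rows of the table appear to follow from the measured ones: the rate is heap-after over heap-before, the average remark size is heap-after over the number of remarks (which would put it in KB), and the mem2disk ratio is heap-after over total file size. The sketch below recomputes them under that reading; the variable names are invented for the example.

```python
# Measured values copied from the table, one entry per column.
remarks_k = [20, 37, 146, 180, 640]           # thousands of remarks
disk_mb = [22, 51, 219, 202, 1034]            # total file size
heap_before_mb = [106, 226, 894, 934, 3573]
heap_after_mb = [86, 179, 694, 739, 2798]

# Rate: heap size after interning relative to before.
rate = [round(a / b, 2) for a, b in zip(heap_after_mb, heap_before_mb)]

# Average remark size: MB per thousand remarks, i.e. roughly KB per remark.
avg_remark_kb = [round(a / n, 2) for a, n in zip(heap_after_mb, remarks_k)]

# Mem2disk: in-memory size relative to on-disk size.
mem2disk = [round(a / d, 2) for a, d in zip(heap_after_mb, disk_mb)]

print(rate)
print(avg_remark_kb)
print(mem2disk)
```

Under this reading, both the average remark size and the mem2disk ratio are computed from the heap size after the change.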

Diff Detail

Repository
rL LLVM

Event Timeline

anemet created this revision. Jul 18 2017, 1:01 AM

Forgot David Li from the original list.

modocache accepted this revision. Jul 19 2017, 1:46 PM

Excellent, thanks for all of the improvements!

I noticed a typo you could fix before landing.

tools/opt-viewer/optrecord.py
Line 69 (On Diff #107028)

Typo: s/handels/handles/g.

This revision is now accepted and ready to land. Jul 19 2017, 1:46 PM
davide accepted this revision. Jul 19 2017, 1:51 PM

LGTM.

This revision was automatically updated to reflect the committed changes.

Also, with https://reviews.llvm.org/rL308537, opt-stats now prints a metric for memory consumption if you have guppy (a.k.a. heapy) installed. The metric is the total memory allocated divided by the number of remarks, that is, the average in-memory remark size (~3KB currently). It is printed once all the remarks are loaded.
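
A minimal sketch of how such a metric could be computed, assuming guppy is an optional dependency. This is not the code from rL308537: the helper names `heap_size_bytes` and `average_remark_size` are invented for the example, and only guppy's `hpy().heap().size` is used.

```python
def heap_size_bytes():
    """Return the current heap size in bytes, or None without guppy."""
    try:
        from guppy import hpy  # optional; skip the metric if missing
    except ImportError:
        return None
    return hpy().heap().size


def average_remark_size(total_bytes, num_remarks):
    """Return bytes per remark, or None if it cannot be computed."""
    if total_bytes is None or num_remarks == 0:
        return None
    return total_bytes / num_remarks


# Hypothetical usage once all remarks have been loaded.
num_remarks = 1000
size = average_remark_size(heap_size_bytes(), num_remarks)
if size is not None:
    print("Average in-memory remark size: {:.0f} bytes".format(size))
```

Returning None when guppy is missing keeps the tool usable without the extra dependency, which matches the "if you have guppy installed" behavior described above.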