
[HotColdSplit] Introduce a cost model to control splitting behavior
Closed (Public)

Authored by vsk on Jan 23 2019, 3:48 PM.

Details

Summary

The main goal of the model is to avoid *increasing* function size, as
that would eradicate any memory locality benefits from splitting. This
happens when:

  • There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost.
  • The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller.
  • The code size of the split region is smaller than the cost of setting up a call to it.
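
The three conditions above amount to comparing a benefit against a penalty. The following is a minimal sketch of that comparison, not the code in this patch: the `ColdRegion` fields, the function names, the weights, and the way the threshold enters the comparison are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColdRegion:
    """Hypothetical summary of a candidate cold region."""
    code_size: int        # estimated size of the region's instructions
    num_inputs: int       # values passed into the split function
    num_outputs: int      # values the caller reloads after the call
    num_exit_blocks: int  # distinct blocks the region can exit to


# Illustrative weights only (assumptions, not the values used by the patch).
CALL_SETUP_COST = 2
COST_PER_INPUT = 1
COST_PER_OUTPUT = 1
COST_PER_EXTRA_EXIT = 1


def outlining_benefit(region: ColdRegion) -> int:
    """Size removed from the caller, net of the call that replaces it."""
    return max(region.code_size - CALL_SETUP_COST, 0)


def outlining_penalty(region: ColdRegion) -> int:
    """Size added to the caller for inputs, output reloads, and exit dispatch."""
    extra_exits = max(region.num_exit_blocks - 1, 0)
    return (region.num_inputs * COST_PER_INPUT
            + region.num_outputs * COST_PER_OUTPUT
            + extra_exits * COST_PER_EXTRA_EXIT)


def should_split(region: ColdRegion, threshold: int = 2) -> bool:
    """Split only if the benefit exceeds the penalty by more than the threshold."""
    return outlining_benefit(region) - outlining_penalty(region) > threshold
```

Under these assumed weights, a large region with one input, one output, and a single exit is split, while a three-instruction region or a region with many inputs, outputs, and exits is left in place.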

A secondary goal is to prevent excessive overall binary size growth.

With the cost model in place, I experimented to find a splitting
threshold that works well in practice. To make warm and cold code easily
separable for analysis purposes, I moved split functions to a "cold"
section. I tried thresholds in the range [0, 4] and set the default to
the threshold that minimized the geomean text size.
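
As a rough illustration of that selection step, the sketch below computes a geometric mean per threshold and picks the smallest one. The helper names and the per-program sizes are hypothetical, and breaking ties in favor of the lower threshold is an assumption rather than something stated in this summary.

```python
import math


def geomean(sizes):
    """Geometric mean of strictly positive sizes, computed via logarithms."""
    if not sizes or any(s <= 0 for s in sizes):
        raise ValueError("geomean needs a non-empty list of positive sizes")
    return math.exp(sum(math.log(s) for s in sizes) / len(sizes))


def best_threshold(sizes_by_threshold):
    """Return the lowest threshold whose geomean size is minimal."""
    # min() keeps the first minimal element, so iterating thresholds in
    # ascending order resolves ties toward the lower threshold (assumption).
    return min(sorted(sizes_by_threshold),
               key=lambda t: geomean(sizes_by_threshold[t]))


# Made-up text sizes (in bytes) for two programs at three thresholds.
example_sizes = {
    0: [100, 400],
    1: [90, 400],
    2: [80, 405],
}
```

With these made-up sizes, `best_threshold(example_sizes)` selects threshold 2, since its geomean (180) is lower than those of thresholds 0 and 1.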

Experiment data from building LNT+externals for X86 (N = 639 programs,
all sizes in bytes):

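| Configuration | __text geom size | __cold geom size | TEXT geom size |
|---------------|------------------|------------------|----------------|
| *-Os*         | 1736.3           | 0, n=0           | 10961.6        |
| -Os, thresh=0 | 1740.53          | 124.482, n=134   | 11014          |
| -Os, thresh=1 | 1734.79          | 57.8781, n=90    | 10978.6        |
| -Os, thresh=2 | 1733.85          | 65.6604, n=61    | 10977.6        |
| -Os, thresh=3 | 1733.85          | 65.3071, n=61    | 10977.6        |
| -Os, thresh=4 | 1735.08          | 67.5156, n=54    | 10965.7        |
| *-Oz*         | 1554.4           | 0, n=0           | 10153          |
| -Oz, thresh=2 | 1552.2           | 65.633, n=61     | 10176          |
| *-O3*         | 2563.37          | 0, n=0           | 13105.4        |
| -O3, thresh=2 | 2559.49          | 71.1072, n=61    | 13162.4        |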

Picking thresh=2 reduces the geomean __text section size by 0.14% at
-Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that
TEXT size is page-aligned, whereas section sizes are byte-aligned.
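
As a quick check of the arithmetic, the sketch below recomputes the -Os __text reduction from the baseline (1736.3) and thresh=2 (1733.85) geomeans and shows what page alignment does to a size. The helper names are hypothetical, and the 4096-byte page size is an assumption used only for illustration.

```python
def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


def page_align(size, page_size=4096):
    """Round an integer byte count up to the next multiple of page_size."""
    return -(-size // page_size) * page_size


# X86 -Os geomean __text sizes: baseline versus thresh=2.
os_text_change = percent_change(1736.3, 1733.85)
```

Here `os_text_change` rounds to -0.14, matching the reduction quoted above. Because a page-aligned size only moves in whole-page steps (for example, `page_align(4097)` is 8192 under the assumed page size), byte-level changes in a section do not translate directly into TEXT size changes.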

Experiment data from building LNT+externals for ARM64 (N = 558 programs,
all sizes in bytes):

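| Configuration | __text geom size | __cold geom size | TEXT geom size |
|---------------|------------------|------------------|----------------|
| *-Os*         | 1763.96          | 0, n=0           | 42934.9        |
| -Os, thresh=2 | 1760.9           | 76.6755, n=61    | 42934.9        |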

Picking thresh=2 reduces the geomean __text section size by 0.17% at
-Os and causes no growth in the TEXT segment.

Measurements were done with D57082 applied.

Diff Detail

Repository
rL LLVM

Event Timeline

vsk created this revision. Jan 23 2019, 3:48 PM
t.p.northover accepted this revision. Jan 25 2019, 7:06 AM
t.p.northover added a subscriber: t.p.northover.

I think this looks pretty reasonable. The penalties are a bit speculative (fairly inevitably since the instructions haven't been created yet), but look sane. If they turn out to be problematic in future we could add some hooks to customize them.

This revision is now accepted and ready to land. Jan 25 2019, 7:06 AM
This revision was automatically updated to reflect the committed changes.