This is an archive of the discontinued LLVM Phabricator instance.

[MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo
ClosedPublic

Authored by bruno on Sep 22 2014, 12:59 PM.

Details

Summary

Teach MachineSink to use BlockFrequencyInfo

Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller loop depths are preferable.
This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available.

Tested it under SPEC2006 train (average of 30 runs for each program). The baseline uses -O3 and PGO without Machine Sink support. There are no regressions found for these programas, and the speedup follows:

benchmark Speedup (Relative change)
400.perlbench 2.49%
401.bzip2 2.59%
403.gcc 2.43%
429.mcf 2.56%
456.hmmer 2.38%
458.sjeng 1.75%
462.libquantum 0.08%
464.h264ref 2.65%
471.omnetpp 1.00%
473.astar 1.81%
483.xalancbmk 1.00%
433.milc 1.44%
444.namd 1.60%
450.soplex 1.63%
470.lbm 1.65%
GeoMean 1.49%

More details on the results at http://pastebin.com/VQBfU4PH

Diff Detail

Event Timeline

bruno updated this revision to Diff 13946.Sep 22 2014, 12:59 PM
bruno retitled this revision from to [MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo.
bruno updated this object.
bruno edited the test plan for this revision. (Show Details)
bruno added reviewers: qcolombet, chandlerc, ributzka.
bruno set the repository for this revision to rL LLVM.
bruno added subscribers: Unknown Object (MLST), bob.wilson.
chandlerc edited edge metadata.Sep 22 2014, 1:13 PM

This seems fine in general.

Could you include a (admitedly contrived) test case that exercises the two
paths?

qcolombet edited edge metadata.Sep 22 2014, 1:18 PM

Same comment as Chandler.

Thanks Bruno,
-Quentin

bruno updated this revision to Diff 14045.Sep 24 2014, 12:00 PM
bruno edited edge metadata.

Thanks for the feedback,

Updated the patch to include a test-case.

Manman, they're collected with train data and tested on train data.
Chad, I've tested it only on darwin x86-64.

qcolombet added inline comments.Sep 24 2014, 12:20 PM
lib/CodeGen/MachineSink.cpp
260

Should we still grab this analysis if UseBlockFreqInfo is false?

588–593

If we do not grab the analysis when UseBlockFreqInfo is false, this should be adapted.

test/CodeGen/X86/sink-blockfreq.ll
2

Put -machine-sink-bfi=true on the run line. Although this is not required, I like being explicit with what we test.

bruno updated this revision to Diff 14049.Sep 24 2014, 1:18 PM

New patch version, updated after Quentin's comments

qcolombet accepted this revision.Sep 24 2014, 1:56 PM
qcolombet edited edge metadata.

Hi Bruno,

Thanks for the quick update.

LGTM.

One remark:
uint64_t LHSFreq = MBFI ? MBFI->getBlockFreq(L).getFrequency() : 0;

We could avoid the penalty of checking whether or not MBFI is available for each tested pair by setting different call backs (one that always returns 0, and one that returns the actual frequency). Those callbacks would be set once in the beginning of runOnMachineFunction, hence having MBFI tested only once. I do not think this will have a measurable impact for most cases. So, this is fine to leave that like this, moreover, if you plan to kill the option, this won’t be necessary anymore :).

Thanks,
-Quentin

This revision is now accepted and ready to land.Sep 24 2014, 1:56 PM
bruno closed this revision.Sep 25 2014, 4:33 PM

Hi Quentin,

Thanks again, I'll kill the option hopefully soon.
Committed in r218472