This is an archive of the discontinued LLVM Phabricator instance.

MergeSimilarFunctions: a code size pass to merge functions with small differences
Needs ReviewPublic

Authored by tobiasvk on Jul 6 2016, 9:44 AM.

Details

Reviewers
hiraditya
sebpop
Summary

This patch contains a version of the MergeFunctions pass that is able to merge
not just identical functions, but also functions with small differences in
their instructions to reduce code size. It does this by inserting control flow
and an additional argument in the merged function to account for the
differences.

I originally presented this work at the LLVM Dev Meeting in 2013 [1]. A more
detailed description was published in a paper at LCTES 2014 [2]. The code was
released to the community at the time, but that version has since bitrotted and
also contained a number of bugs. Meanwhile, the pass has been in production use
at QuIC for the past few years and has been actively maintained internally.

I continue to receive requests for this code every few months, so there
definitely seems to be interest for this optimization out there and I'd like to
share it with the community once again. It has now diverged quite significantly
from the upstream MergeFunctions pass, so adding it as a separate, optional
pass would probably make the most sense.

The main disadvantage - which we have not addressed since 2013 - is that the
worst-case behavior of this optimization is quadratic in the number of
functions in the module. However, thanks to hashing, this is very rarely a
problem in practice. In 3 years of production use at QuIC over large code
bases, we've seen just one source file that triggered excessive compile times
(it was auto-generated and contained several thousand nearly-identical small
functions). We even run this in (fat) LTO without problems.

Note that the patch as-is will only attempt to merge functions that were
compiled with -Os. It's best to follow it by a run of jump threading,
instcombine, and simplifycfg to clean up.

[1] http://llvm.org/devmtg/2013-11/#talk3
[2] http://dl.acm.org/citation.cfm?id=2597811

Diff Detail

Event Timeline

tobiasvk updated this revision to Diff 62893.Jul 6 2016, 9:44 AM
tobiasvk retitled this revision from to MergeSimilarFunctions: a code size pass to merge functions with small differences.
tobiasvk updated this object.
tobiasvk added a subscriber: llvm-commits.
majnemer added inline comments.
lib/Transforms/IPO/MergeSimilarFunctions.cpp
100

This is not a good suffix to use, it is not reserved (user code can have it).

I'd suggest .merged because it will demangle nicely and is impossible to get to via most normal means.

311–324

This code is a little inconsistent regarding bracing.

341

I'd use an in-class initializer.

349

Is this definition/declaration necessary?

351–355

Would it be insufficient to = default; this operator?

376–378

No need to close & reopen the namespace.

387–388

In-class initializer?

391

Remove this?

501–503

Hmm, this won't handle token type.

618–645

This looks a lot like haveSameSpecialState in Instruction.cpp, it would be best to reuse that code.

1111–1117

Could you use DeleteContainerPointers ?

1375–1383

Why not let global-opt do this?

jfb added a comment.Jul 6 2016, 2:42 PM

Could you detail how this is different from MergeFunc, and what it would take to add the new capabilities to MergeFunc instead of duplicating?

In D22051#476003, @jfb wrote:

Could you detail how this is different from MergeFunc, and what it would take to add the new capabilities to MergeFunc instead of duplicating?

The main difference is that the in-tree MergeFunctions can only merge identical functions (modulo pointer types) whereas MergeSimilarFunctions is also capable of merging sets of functions that are merely similar, i.e. have some differences in their instructions. This expands the scope of the optimization significantly.

As to whether this could be integrated into the in-tree MergeFunctions... Well, this started off as a patch to MergeFunctions back in 2013. However, the in-tree MergeFunctions has undergone significant architectural changes since then; it now uses a total ordering of functions to speed up merging. This is great if you only want to merge identical functions, but it doesn't work for merging of similar functions.

So it really depends on what your optimization goal is. If you want to eliminate duplicates quickly, the in-tree MergeFunctions is great. In our experience, however, the main benefit comes from being able to deal with those small differences here and there that arise e.g. from template instantiations or 'copy-paste-and-hack'.

In D22051#476003, @jfb wrote:

Could you detail how this is different from MergeFunc, and what it would take to add the new capabilities to MergeFunc instead of duplicating?

The main difference is that the in-tree MergeFunctions can only merge identical functions (modulo pointer types) whereas MergeSimilarFunctions is also capable of merging sets of functions that are merely similar, i.e. have some differences in their instructions. This expands the scope of the optimization significantly.

I'm not sure that's true anymore, or at least I'm pretty sure it's now much easier to add that capability to MergeFunc than it used to be. @jrkoenig fixed a bunch of issues in MergeFunc last summer, and experimented with merging similar functions by adding support for fuzzy-equivalence in MergeFunc.

As to whether this could be integrated into the in-tree MergeFunctions... Well, this started off as a patch to MergeFunctions back in 2013. However, the in-tree MergeFunctions has undergone significant architectural changes since then; it now uses a total ordering of functions to speed up merging. This is great if you only want to merge identical functions, but it doesn't work for merging of similar functions.

I think that's incorrect. The comparison functions can be tuned in MergeFunc to achieve what you want.

So it really depends on what your optimization goal is. If you want to eliminate duplicates quickly, the in-tree MergeFunctions is great. In our experience, however, the main benefit comes from being able to deal with those small differences here and there that arise e.g. from template instantiations or 'copy-paste-and-hack'.

I entirely agree with that goal, but I'd much rather see *less* code duplication. MergeFunc is a finicky piece of code, and duplicating it sounds like a pretty bad approach versus fixing it. This fork-from-2013 won't have some of the more recent fixes from MergeFunc, and we'd just end up with yet another pass to maintain.

Thanks David for the quick review - agree with most of your points, some comments added.

lib/Transforms/IPO/MergeSimilarFunctions.cpp
100

Yes, I've been thinking that too. I do vaguely recall, though, that some architectures can't handle symbol names with dots in them? Anyone have any better recollection of this? It's certainly fine for Hexagon.

618–645

Yep, except for some important differences:

  • We treat pointer-to-A, pointer-to-B, and the corresponding int type as equivalent
  • An alloca that allocates more space is 'equivalent' to one that allocates less.

MergeFunctions has its own copy of this stuff too.

1375–1383

Merge(Similar)Functions typically runs near the end of the pipeline after GlobalOpt has already run.

davide added a subscriber: davide.Nov 8 2016, 11:00 PM
appcs added a subscriber: appcs.Dec 14 2016, 11:04 AM

MergeSimilarFunctions demonstrates great improvement on Cisco software, so this is of great value to us. We have some improvements upon the base work to improve the compile time.

MergeSimilarFunctions demonstrates great improvement on Cisco software, so this is of great value to us. We have some improvements upon the base work to improve the compile time.

I'm using it too. It improves the code size for us. Can you share your improvements?
Thanks,