This is an archive of the discontinued LLVM Phabricator instance.

MergeSimilarFunctions 1/n: a code size pass to merge functions with small differences
Needs ReviewPublic

Authored by hiraditya on Oct 4 2018, 11:22 AM.

Details

Summary

Porting the merge similar functions along with bugfixes.
https://reviews.llvm.org/D22051

Not sure if I should commandeer the patch there of create a new one. I am porting this because I have subsequent patches to enable this during thin-lto (https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk2). Existing Function Merging optimization is not as good as this one because this optimization can merge functions with slight differences as well.


This patch contains a version of the MergeFunctions pass that is able to merge
not just identical functions, but also functions with small differences in
their instructions to reduce code size. It does this by inserting control flow
and an additional argument in the merged function to account for the
differences.

I originally presented this work at the LLVM Dev Meeting in 2013 [1]. A more
detailed description was published in a paper at LCTES 2014 [2]. The code was
released to the community at the time, but that version has since bitrotted and
also contained a number of bugs. Meanwhile, the pass has been in production use
at QuIC for the past few years and has been actively maintained internally.

I continue to receive requests for this code every few months, so there
definitely seems to be interest for this optimization out there and I'd like to
share it with the community once again. It has now diverged quite significantly
from the upstream MergeFunctions pass, so adding it as a separate, optional
pass would probably make the most sense.

The main disadvantage - which we have not addressed since 2013 - is that the
worst-case behavior of this optimization is quadratic in the number of
functions in the module. However, thanks to hashing, this is very rarely a
problem in practice. In 3 years of production use at QuIC over large code
bases, we've seen just one source file that triggered excessive compile times
(it was auto-generated and contained several thousand nearly-identical small
functions). We even run this in (fat) LTO without problems.

Note that the patch as-is will only attempt to merge functions that were
compiled with -Os. It's best to follow it by a run of jump threading,
instcombine, and simplifycfg to clean up.

[1] http://llvm.org/devmtg/2013-11/#talk3
[2] http://dl.acm.org/citation.cfm?id=2597811

Diff Detail

Event Timeline

hiraditya created this revision.Oct 4 2018, 11:22 AM
jfb added a comment.Oct 4 2018, 11:26 AM

I don't understand: why is it desirable to have this as a separate thing compared to merge funcs? It seems better to use the same code for both, and when calling mergefuncs pass in a mode which chooses to merge exact matches, or similar matches.

Do you have up-to-date compile-time numbers (at O0 and O2), as well as size numbers?

hiraditya updated this revision to Diff 168337.Oct 4 2018, 11:27 AM

DEBUG -> LLVM_DEBUG

hiraditya retitled this revision from MergeSimilarFunctions: a code size pass to merge functions with small differences to MergeSimilarFunctions 1/n: a code size pass to merge functions with small differences.Oct 13 2018, 11:24 PM

Compiler error with CloneType

nhaehnle removed a subscriber: nhaehnle.Oct 16 2018, 2:55 AM

@hiraditya Thank you very much for the code, I'm very interested in it. I'm trying to compile this and the 2/n revision together but I'm getting compilation failures, e.g.:

include/llvm/IR/ModuleSummaryIndex.h|967 col 12| error: ‘HostSimilarFunction’ was not declared in this scope

Is some part of the code is missing or is it intentional and will be fixed in the subsequent patches?

Herald added a project: Restricted Project. · View Herald TranscriptSep 29 2019, 10:35 PM

fix clonetype tests

rjf added a subscriber: rjf.Mar 18 2020, 6:48 PM
rjf updated this revision to Diff 251556.EditedMar 19 2020, 11:35 PM

Merges D52896 with trunk

dmajor added a subscriber: dmajor.Nov 11 2020, 12:12 PM