Porting the merge similar functions along with bugfixes.
Not sure if I should commandeer the patch there of create a new one. I am porting this because I have subsequent patches to enable this during thin-lto (https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk2). Existing Function Merging optimization is not as good as this one because this optimization can merge functions with slight differences as well.
This patch contains a version of the MergeFunctions pass that is able to merge
not just identical functions, but also functions with small differences in
their instructions to reduce code size. It does this by inserting control flow
and an additional argument in the merged function to account for the
I originally presented this work at the LLVM Dev Meeting in 2013 . A more
detailed description was published in a paper at LCTES 2014 . The code was
released to the community at the time, but that version has since bitrotted and
also contained a number of bugs. Meanwhile, the pass has been in production use
at QuIC for the past few years and has been actively maintained internally.
I continue to receive requests for this code every few months, so there
definitely seems to be interest for this optimization out there and I'd like to
share it with the community once again. It has now diverged quite significantly
from the upstream MergeFunctions pass, so adding it as a separate, optional
pass would probably make the most sense.
The main disadvantage - which we have not addressed since 2013 - is that the
worst-case behavior of this optimization is quadratic in the number of
functions in the module. However, thanks to hashing, this is very rarely a
problem in practice. In 3 years of production use at QuIC over large code
bases, we've seen just one source file that triggered excessive compile times
(it was auto-generated and contained several thousand nearly-identical small
functions). We even run this in (fat) LTO without problems.
Note that the patch as-is will only attempt to merge functions that were
compiled with -Os. It's best to follow it by a run of jump threading,
instcombine, and simplifycfg to clean up.