This is an archive of the discontinued LLVM Phabricator instance.

Accelerate MergeFunctions with hashing
ClosedPublic

Authored by jrkoenig on Aug 10 2015, 5:26 PM.

Details

Summary

This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.

The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.

The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.

The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.

The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.

Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.

Diff Detail

Event Timeline

jrkoenig retitled this revision from to Accelerate MergeFunctions with hashing.
jrkoenig updated this object.
jrkoenig added reviewers: jfb, nlewycky, dschuff.
jfb edited edge metadata.Aug 10 2015, 6:21 PM
  • "mysql" typo in the description.
  • Speedup numbers are hard to understand. What's a 117% speedup?
lib/Transforms/IPO/MergeFunctions.cpp
34

s/comparision/comparison/

142

Here and else, terminate sentences with periods.

424

s/comparision/comparison/

1097

Can you just use F.args() instead?

1310

Make uint64_t a typedef from the HashAccumulator above, and use it here. This will prevent the types from diverging.

Mysqld is the name of the server binary, as opposed to the client library or
CLI binary. Timings of just the MergeFunctions pass (average of 4 runs):

BinaryBeforeAfterSpeedup
Mysqld0.212s0.193s9%
Chromium11.5s5.31s117%
libxul.so (Firefox)10.4s7.11s46%

117% speedup is a little over 2x faster.

jrkoenig edited edge metadata.

Fix typos, makes typedef for hash value, and fixes issue with signature hashing

Fixes several typos in comments, as pointed out by jfb.

Makes a new typedef in FunctionComparator, FunctionHash because the hash type is part of the comparator interface. The HashAccumulator (now HashAccumulator64) class is moved near the FunctionComparator::functionHash function because it is an implementation detail.

The types of function arguments are no longer considered in the hashing, because they did not contribute any discriminating power to the hash, and the previous implementation was inconsistent with cmpTypes (which may cause rare missed merges).

Actually use the typedef in client code.

jfb added a comment.Aug 12 2015, 8:55 AM

Overall this looks good.

lib/Transforms/IPO/MergeFunctions.cpp
1095

Can you explain how the hash function is tied to the comparison function (things that the comparison function doesn't about must not be hashed). This will be important to document when merging similar (but not quite the same) functions.

Also explain why this is a special-purpose hash function for merge funcs, and isn't generally useful for ADT/Hashing.h or std::hash.

dschuff edited edge metadata.Aug 12 2015, 9:18 AM

just a couple of nits

lib/Transforms/IPO/MergeFunctions.cpp
1100

use ULL suffix here too?

1132

The style of this file seems to be "auto &Inst"

1307–1308

Might as well make this a range-for as long as you're touching every line around it.

1309

Style of this file is "Function *F"

1333

space between for and (
also, could just be a range-for

1336

space between if ((

jrkoenig marked 3 inline comments as done.
jrkoenig edited edge metadata.

Fix style issues, change hash and test, add comments

Minor style issues are fixed.

The hash is changed to using an existing LLVM function to do the mixing.
This causes a test to fail because the functions are emitted in the
opposite order than before. (This is an issue with the test being flakey,
not the determinism of the compiler, because the order is stable for each
choice of hash function.)

Adds a comment explaining the relation between the hash and the comparison.

jrkoenig marked 8 inline comments as done.Aug 12 2015, 3:25 PM

Updated style issues, etc.

lib/Transforms/IPO/MergeFunctions.cpp
1333

This can't be a range for because the next and previous values are needed.

jfb added a comment.Aug 12 2015, 4:06 PM

Can you update the patch description, since it's now not quite correct (e.g. farmhash).

The hash is changed to using an existing LLVM function to do the mixing.
This causes a test to fail because the functions are emitted in the
opposite order than before. (This is an issue with the test being flakey,
not the determinism of the compiler, because the order is stable for each
choice of hash function.)

Can you expand on this? Which test fails, and how? Can you fix it?

jrkoenig marked an inline comment as done.Aug 12 2015, 4:20 PM

@jfb The failing test is Transforms/MergeFunc/call-and-invoke-with-ranges.ll. I have included a fix in this patch. This quick fix is to just swap the functions so the CHECK-LABEL directives pick up on the right definitions. The compile is still correct, the test is just not robust enough to see this.

jfb added a comment.Aug 12 2015, 4:56 PM

lgtm, leaving open for others to comment.

test/Transforms/MergeFunc/call-and-invoke-with-ranges.ll
79

How sensitive is mergefunc to slight changes causing reordering of functions?

It might be something worth tackling later.

Any other comments?

jfb accepted this revision.Aug 14 2015, 1:56 PM
jfb edited edge metadata.
This revision is now accepted and ready to land.Aug 14 2015, 1:56 PM
This revision was automatically updated to reflect the committed changes.
jfb added a subscriber: chapuni.Aug 17 2015, 8:27 AM
jfb added inline comments.
llvm/trunk/lib/Transforms/IPO/MergeFunctions.cpp
1319 ↗(On Diff #32208)

(for posterity) @chapuni fixed this line in r245168 to be a stable sort and ignore the pair's Function*.

chapuni added inline comments.Aug 17 2015, 8:28 PM
llvm/trunk/lib/Transforms/IPO/MergeFunctions.cpp
1311 ↗(On Diff #32208)

IMO, sorting hashes might be overkill here.

Assumptions;

  1. Most of each hash entry might have single Function. In practical code, I assume that hundreds of functions wouldn't collide.
  2. For merging, order among hashes wouldn't be matter. Each hash could be considered as individual domain.

I suggest like

DenseMap<FunctionHash,SmallVector<Function*,1>> HashedFuncs;
jrkoenig added inline comments.Aug 18 2015, 11:14 AM
llvm/trunk/lib/Transforms/IPO/MergeFunctions.cpp
1311 ↗(On Diff #32208)
  1. Most entries are single functions, yes, but the design of the hash still makes huge groups. For example, all functions of the form define f(...) { ret CONSTANT; } are in the same bucket, and there are thousands of them!
  2. The order in the function tree set is by hash, then full comparison, so the functions have to be sorted anyway. The sorting is a negligible part of the runtime.

I'm working on more extensive reworking of MergeFunctions which should split the functions into domains by hash, as you suggested. This patch was designed to be a small, self-contained start.