This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use hash table in LEA optimization pass
ClosedPublic

Authored by aturetsk on Jan 21 2016, 6:18 AM.

Details

Summary

Use hash table (key is a memory operand) to store found LEA instructions to reduce compile time.

Diff Detail

Repository
rL LLVM

Event Timeline

aturetsk updated this revision to Diff 45519.Jan 21 2016, 6:18 AM
aturetsk retitled this revision from to [X86] Use hash table in LEA optimization pass.
aturetsk updated this object.
aturetsk added reviewers: qcolombet, dberlin.
aturetsk added a subscriber: llvm-commits.

As Daniel suggested in https://llvm.org/bugs/show_bug.cgi?id=25843 I added a hash table to store LEAs to search for them faster in removeRedundantLEAs and removeRedundantAddrCalc.
I used the same test from the bug as in http://reviews.llvm.org/D15692, but with -Oz option (to enable the pass fully).
Here are the results:

Without LEA pass:

none$ time ./bin/clang++ -std=c++11 -S test-oz.ll -Oz -mllvm -disable-x86-lea-opt

real    0m8.491s
user    0m8.437s
sys     0m0.052s

With old LEA pass:

none$ time ./bin/clang++ -std=c++11 -S test-oz.ll -Oz -mllvm -enable-x86-lea-opt

real    0m8.878s
user    0m8.819s
sys     0m0.051s

With new LEA pass (no need to pass -enable-lea-opt any more):

none$ time ./bin/clang++ -std=c++11 -S test-oz.ll -Oz

real    0m8.546s
user    0m8.493s
sys     0m0.051s

The difference for the current example is small, but visible. I think for the cases mentioned in the bug it will be more noticeable.

qcolombet added inline comments.Jan 21 2016, 10:39 AM
lib/Target/X86/X86OptimizeLEAs.cpp
42 ↗(On Diff #45519)

This change is unrelated and could be its own commit.

59 ↗(On Diff #45519)

You should use /// here.

83 ↗(On Diff #45519)

Explain in a comment why having two different immediate is still okay for operator==.

91 ↗(On Diff #45519)

Add comments for each field.

100 ↗(On Diff #45519)

This may be undefined behavior if size_t is a 32-bit value.
In other words, promote the computation on int64_t, then convert to the size_t type.

125 ↗(On Diff #45519)

We should have an assert on the kind of MI we get as input.
E.g., adding X86::AddrDisp may not make sense.

153 ↗(On Diff #45519)

Why not use a DenseMap?

dberlin added inline comments.Jan 21 2016, 10:59 AM
lib/Target/X86/X86OptimizeLEAs.cpp
100 ↗(On Diff #45519)

Or just use hash_code as the type :)

103 ↗(On Diff #45519)

You should just use hash_combine, which will call hash_value.

IE

hash_code Hash = hash_combine(*Key.Operands[0], *Key.Operands[1], *Key.Operands[2], *Key.Operands[3]);
if (Key.Disp->isGlobal())
  Hash = hash_combine(Hash, Key.Disp->getGlobal());

This will do all the mixing and calling of hash_value for you, and you won't need anything else in this function.

aturetsk updated this revision to Diff 46650.Feb 2 2016, 6:51 AM

Fixed the remarks.

aturetsk updated this object.Feb 2 2016, 6:51 AM
aturetsk added inline comments.Feb 2 2016, 6:56 AM
lib/Target/X86/X86OptimizeLEAs.cpp
42 ↗(On Diff #46650)

Excluded from the patch.

56 ↗(On Diff #46650)

Done.

80 ↗(On Diff #46650)

Done.

88 ↗(On Diff #46650)

Done.

97 ↗(On Diff #46650)

I used hash_code inside the function and converted it into unsigned int since that's the return type expected from getHashValue.

100 ↗(On Diff #46650)

Fixed.

122 ↗(On Diff #46650)
assert((isLEA(MI) || MI.mayLoadOrStore()) &&
         "The instruction must be a LEA, a load or a store");

Is that OK?

186 ↗(On Diff #46650)

Fixed.

dberlin edited edge metadata.Feb 3 2016, 7:46 AM

It looks like all the comments I had were fixed. Just ping quentin and make sure he's okay with it, and then LGTM.
(If we end up with further perf issues, we can look into changing it a bit more, but i'm pretty sure it's not going to take a lot of time at this point)

qcolombet accepted this revision.Feb 3 2016, 9:19 AM
qcolombet edited edge metadata.

LGTM.

Q.

This revision is now accepted and ready to land.Feb 3 2016, 9:19 AM
This revision was automatically updated to reflect the committed changes.