Page MenuHomePhabricator

COFF, ELF: ICF: Perform 2 rounds of relocation hash propagation.
ClosedPublic

Authored by pcc on Jan 20 2019, 2:31 PM.

Details

Summary

LLD's performance on PGO instrumented Windows binaries was still not
great even with the fix in D56955; out of the 2m41s linker runtime,
around 2 minutes were still being spent in ICF. I looked into this more
closely and discovered that the vast majority of the runtime was being
spent segregating .pdata sections with the following relocation chain:

.pdata -> identical .text -> unique PGO counter (not eligible for ICF)

This patch causes us to perform 2 rounds of relocation hash
propagation, which allows the hash for the .pdata sections to
incorporate the identifier from the PGO counter. With that, the amount
of time spent in ICF was reduced to about 2 seconds. I also found that
the same change led to a significant ICF performance improvement in a
regular release build of Chromium's chrome_child.dll, where ICF time
was reduced from around 1s to around 700ms.

With the same change applied to the ELF linker, median of 100 runs
for lld-speed-test/chrome reduced from 4.53s to 4.45s on my machine.

I also experimented with increasing the number of propagation rounds
further, but I did not observe any further significant performance
improvements linking Chromium or Firefox.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Jan 20 2019, 2:31 PM

Is this worth picking over to 8.0 after it lands?

pcc added a comment.Jan 22 2019, 1:22 PM

Is this worth picking over to 8.0 after it lands?

I'm more ambivalent about this one since it's more of a perf improvement than a fix for a perf regression like D56955 is. But I have no strong objections to doing so.

In D56986#1366874, @pcc wrote:

Is this worth picking over to 8.0 after it lands?

I'm more ambivalent about this one since it's more of a perf improvement than a fix for a perf regression like D56955 is. But I have no strong objections to doing so.

That's fair; we can just cherry-pick it to our 8.0 branch internally if it doesn't end up merged here :) It's early in the release cycle though, so I figured it was okay for small-ish changes like this which result in big improvements to go in.

rnk accepted this revision.Jan 22 2019, 1:33 PM

lgtm, nice!

This revision is now accepted and ready to land.Jan 22 2019, 1:33 PM
In D56986#1366874, @pcc wrote:

Is this worth picking over to 8.0 after it lands?

I'm more ambivalent about this one since it's more of a perf improvement than a fix for a perf regression like D56955 is. But I have no strong objections to doing so.

That's fair; we can just cherry-pick it to our 8.0 branch internally if it doesn't end up merged here :) It's early in the release cycle though, so I figured it was okay for small-ish changes like this which result in big improvements to go in.

FWIW I was hoping to see this be merged to 8.0 as well.

pcc added a comment.Jan 22 2019, 2:10 PM
In D56986#1366874, @pcc wrote:

Is this worth picking over to 8.0 after it lands?

I'm more ambivalent about this one since it's more of a perf improvement than a fix for a perf regression like D56955 is. But I have no strong objections to doing so.

That's fair; we can just cherry-pick it to our 8.0 branch internally if it doesn't end up merged here :) It's early in the release cycle though, so I figured it was okay for small-ish changes like this which result in big improvements to go in.

FWIW I was hoping to see this be merged to 8.0 as well.

Okay, let's go for it.

This revision was automatically updated to reflect the committed changes.
pcc added a comment.Jan 22 2019, 4:02 PM

Merge requested in PR40416.