This is an archive of the discontinued LLVM Phabricator instance.

[dsymutil] Perform analyzeContextInfo and CloneDIEs in parallel
ClosedPublic

Authored by JDevlieghere on Mar 1 2018, 7:55 AM.

Details

Summary

This patch makes dsymutil perform analyzeContextInfo and CloneDIEs in
parallel. For the same object file, there is a dependency between the two.
However, we can do analyzeContextInfo for the next object file while cloning
DIEs for the current. This is exactly the approach taken in this patch.

For WebCore, this leads to a performance improvement of 29% and for clang we
see similar results with at 32% improvement.

A large part of the change is related to passing around objects during linking,
instead of having one "current" counterpart accessible by all the linker's
methods. We group these objects into a new struct called LinkContext which
exists for every object file.

A big thanks to Pete Cooper (@pete) who came up with the idea and a PoC.

---------------------------------------------
WebCore (before)
---------------------------------------------
78.13 real        72.85 user         4.80 sys
77.92 real        72.75 user         4.87 sys
77.91 real        72.69 user         4.91 sys
78.38 real        73.01 user         5.00 sys
79.48 real        73.99 user         5.03 sys
---------------------------------------------
78.36 real
---------------------------------------------

---------------------------------------------
WebCore (after)
---------------------------------------------
55.22 real        74.76 user         6.16 sys
54.22 real        73.72 user         5.89 sys
55.13 real        74.62 user         6.02 sys
55.83 real        74.89 user         6.49 sys
54.87 real        74.23 user         6.10 sys
---------------------------------------------
55.05 real
---------------------------------------------

---------------------------------------------
Clang (before)
---------------------------------------------
82.16 real        78.16 user         3.61 sys
82.92 real        78.79 user         3.69 sys
84.59 real        79.85 user         3.95 sys
84.45 real        79.80 user         3.93 sys
85.11 real        80.23 user         4.07 sys
---------------------------------------------
83.85 real
---------------------------------------------

---------------------------------------------
Clang (after)
---------------------------------------------
56.28 real        81.32 user         6.53 sys
56.04 real        81.24 user         6.39 sys
56.65 real        81.42 user         6.36 sys
57.06 real        81.75 user         6.89 sys
55.65 real        80.40 user         6.59 sys
---------------------------------------------
56.34 real
---------------------------------------------

Diff Detail

Event Timeline

JDevlieghere created this revision.Mar 1 2018, 7:55 AM
JDevlieghere retitled this revision from [dsymutil] Perform analyzing and cloning DIE in parallel to [dsymutil] Perform analyzeContextInfo and CloneDIEs in parallel.Mar 1 2018, 8:37 AM
friss added a comment.Mar 1 2018, 8:44 AM

The results looks great! I haven't looked at the patches themselves, but I suppose this patch should have no impact on the output itself. Did you check if your resulting linked DWARF is exactly identical to the one before the patch?

  • Added default argument for Verbose
  • Added comment explaining the need for strong types.

The results looks great! I haven't looked at the patches themselves, but I suppose this patch should have no impact on the output itself. Did you check if your resulting linked DWARF is exactly identical to the one before the patch?

Yes, I compared the MD5 hashes for clang and they were identical.

  • Full context diff
aprantl accepted this revision.Mar 2 2018, 9:46 AM

As far as I can tell this looks very good, thanks! Fred may have more to say about the algorithmic side.

tools/dsymutil/DwarfLinker.cpp
1421

maybe use a struct instead of a pair for better readability? Up to you.

2317

why does this string need to start with a space?

4111

could skip the braces

This revision is now accepted and ready to land.Mar 2 2018, 9:46 AM
  • Feedback Adrian
JDevlieghere marked 3 inline comments as done.Mar 2 2018, 3:15 PM
friss added a comment.Mar 12 2018, 5:30 PM

It would be nice to land this in 2 commits, one NFC that does the refactoring up to creating the LinkContext and the other one that introduces the parallel processing. Otherwise, LGTM

friss accepted this revision.Mar 12 2018, 5:30 PM

It would be nice to land this in 2 commits, one NFC that does the refactoring up to creating the LinkContext and the other one that introduces the parallel processing. Otherwise, LGTM

Landed LinkContext in rL327382.

This revision was automatically updated to reflect the committed changes.