This is an archive of the discontinued LLVM Phabricator instance.

Parallelize tail-merge string table construction.
Needs ReviewPublic

Authored by ruiu on Oct 3 2017, 9:24 PM.

Details

Reviewers
pcc
espindola
Summary

This patch parallelizes MergeTailSection just like I did to
MergeNoTailSection in r314588.

On my 2-socket 20-core 40-threads Xeon E5-2680 @ 2.8 GHz machine,
this patch shortens the clang debug build link time with -O2 from
11.35s to 5.72s. Without -O2, it's 5.23s, so the overhead of -O2 is
now about a half second in this test environment.

Event Timeline

ruiu created this revision.Oct 3 2017, 9:24 PM
ruiu edited the summary of this revision. (Show Details)Oct 3 2017, 9:45 PM
ruiu updated this revision to Diff 119807.Oct 22 2017, 7:06 PM
  • Embed TailHash into SectionPiece
ruiu added a comment.Oct 26 2017, 3:50 PM

Ping. What do you think of this?

grimar added a subscriber: grimar.Oct 27 2017, 3:30 AM
ruiu added a comment.Feb 8 2018, 1:36 PM

I forgot this patch, but I'd like to get this in. Can you review?

ruiu added a reviewer: pcc.Feb 8 2018, 1:36 PM
inglorion added inline comments.
lld/ELF/SyntheticSections.cpp
2988

What if you did the equivalent of:

size_t N = std::min(4, S.size());
uint32_t X = 0;
memcpy(&X, S.data() + S.size() - N, N);
return hash_value(X);

That would allow you to handle strings of any length and should be super efficient.

If you need the hash to not depend on the endianness of the host you could use ulittle32_t, but I think in this case you don't need to care what the actual values produced by the hash function are so ignoring endianness seems fine.

espindola edited reviewers, added: espindola; removed: rafael.Mar 15 2018, 9:07 AM
ruiu updated this revision to Diff 195092.Apr 14 2019, 10:04 PM
  • rebased
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2019, 10:04 PM
Herald added a subscriber: MaskRay. · View Herald Transcript

It is a bit unclear to me which test you want to change to show the effect of this patch.
I.e. I would expect either adding one new test or modifying one of the
existent, but you change the 3 tests here and it is a bit confusing IMO.

Also, since this patch really has an old base (you posted it in 2017),
do you think it is worth to update its description with fresh numbers?

lld/ELF/SyntheticSections.cpp
3001

nit: I might be mistaken, but I think here you need a comma: "Currently" -> "Currently, "

lld/ELF/SyntheticSections.h
848

nit: no full stop at the end of the comment.

ruiu added a comment.Apr 15 2019, 1:25 AM

This patch naturally changes the order how strings are placed in a merged string section. In addition to that, with this change, strings equal to or shorter than 4 bytes long are not merged. This is why.

Assume we have two string, "cd" and "abcd". They have the same tail.

Since we use a hash value of the last 4 characters of each string as a shard ID, we compute two hashes, H1=hash_vaule("cd") and H2=hash_value("abcd"). It is unlikely that H1==H2 because they are just different strings. Therefore, it is likely that "cd" and "abcd" are added to different shards, and they are not merged.

Due to the above behavior, I needed to make the strings used in the test little longer, as otherwise they wouldn't have been merged at all.

ruiu updated this revision to Diff 195102.Apr 15 2019, 1:42 AM
  • address review comments
  • simplify
ruiu marked an inline comment as done.Apr 15 2019, 1:43 AM
ruiu added inline comments.
lld/ELF/SyntheticSections.cpp
2988

After simplifying this function it became too short, so I inlined the function.

ruiu updated this revision to Diff 195104.Apr 15 2019, 1:44 AM
  • 80 columns

Thanks for the explanation! Two more comments/questions below.

lld/ELF/InputSection.h
238

Seems you could store ShardID : 5 here instead of TailHash.
That would save a few bits for OutputOff and you can get rid of computations in the code.
Does it make sense?

lld/ELF/SyntheticSections.cpp
2997

Not only when -O2, but actually -On, where n >= 2.

ruiu updated this revision to Diff 204711.Jun 14 2019, 12:10 AM
  • rebased
  • instead of storing a tail hash, directly store shard-id to SectionPieces