This is an archive of the discontinued LLVM Phabricator instance.

Mere SHF_STRING
ClosedPublic

Authored by rafael on Oct 21 2015, 2:15 PM.

Details

Reviewers
ruiu
Summary

This as another early patch FYI.

The main issues with this one is the use of llvm::StringTableBuilder. It assumes a specific string format and we have to hack around it. It is also fairly slow.

With this patch the size of clang goes from 87499552 to 77160296, but the link time goes from 0.527103515 to 1.072978416.

Gold produces a binary that is 64589560 bytes long in 1.099916754 seconds.

I will try to code a more efficient StringTableBuilder and optimize offset lookups.

Diff Detail

Event Timeline

rafael updated this revision to Diff 38048.Oct 21 2015, 2:15 PM
rafael retitled this revision from to Mere SHF_STRING.
rafael updated this object.
rafael added a reviewer: ruiu.
rafael added a subscriber: llvm-commits.
ruiu edited edge metadata.Oct 21 2015, 2:35 PM

Do we have a balance tree or trie in LLVM? Is there any other good data structure to find common string prefix quickly?

Adding strings to the merge section in string length order (insert longer strings first) would help?

ELF/InputSection.cpp
181

I think we really don't have to support SHF_MERGE string section whose entry size is not 1 because almost all string literals are in ASCII or in UTF-8.

emaste added a subscriber: emaste.Oct 21 2015, 4:44 PM
rafael updated this revision to Diff 38172.Oct 22 2015, 2:08 PM
rafael edited edge metadata.

Implemented some optimizations:

  • Implements in llvm (git pull)
  • Store the start point of each string in the input section.
  • Cache the output offset.

Since the string case with this also works for regular merge data, use it there too.

I am doing a new clang build with this and will post the results shortly.

I am doing a new clang build with this and will post the results shortly.

This time including --gc-sections (since it is already on master):

  • gold 77092528 bytes, 1.591897178 seconds
  • master 84991536 bytes, 0.501578306 seconds
  • patch 74723056 bytes, 0.734235534 seconds

Cheers,
Rafael

ruiu added inline comments.Oct 22 2015, 3:53 PM
ELF/InputSection.h
83

Do we need this class? This class seems to provide only this member function, getOffset(). I'd be tempted to squash the class hierarchy by removing this.

ELF/OutputSections.cpp
760–763

You can use std::all_of if you like.

ELF/OutputSections.h
255

Do you really need this cache? StringTableBuilder now provides a map from stringRef to its offset in the section, so this seems to be redundant.

rafael added inline comments.Oct 23 2015, 5:47 AM
ELF/OutputSections.h
255

Good question.

This is in MergeOutputSection, which handles non string cases, so it is currently not a cache.

It might be possible to view non string merging as a special case of string merging since strings are what take most of the time. Let me give that a try.

Do you really need this cache? StringTableBuilder now provides a map from stringRef to its offset in the section, so this seems to be redundant.

Good question.

This is in MergeOutputSection, which handles non string cases, so it is currently not a cache.

It might be possible to view non string merging as a special case of string merging since strings are what take most of the time. Let me give that a try.

Unfortunately the existing StringTableBuilder assumes null
termination, so we can't use it as a general case. I did implement a
more general solution when trying to optimize it. I will give it a
try.

On your question on needing MergeInputSectionBase: I think I need it.
It holds the Offsets vector on which we binary search.

Cheers,
Rafael

rafael updated this revision to Diff 38233.Oct 23 2015, 6:14 AM

Rebased and use std::all_of

Comment at: ELF/InputSection.cpp:166
@@ +165,3 @@
+
+static size_t findNull(StringRef S, unsigned EntSize) {

+ for (unsigned I = 0, N = S.size(); I != N; I += EntSize) {

I think we really don't have to support SHF_MERGE string section whose entry size is not 1 because almost all string literals are in ASCII or in UTF-8.

They do show up. When linking clang the sizes/times I get now are

gold: 77092512 1.584946846
master: 87482424 0.491110448
merge str 1: 77213816 0.658822729
merge all str: 77209848 0.656403298

All I had to do to get good performance when merging all strings was
add one special case. Uploading the patch.

Cheers,
Rafael

rafael updated this revision to Diff 38236.Oct 23 2015, 8:04 AM

Optimize the common case of strings of size 1.

ruiu added inline comments.Oct 23 2015, 9:35 AM
ELF/InputSection.cpp
134

Please add a comment saying that this function translates an offset in this input merged section to an offset in an output section.

142

It's nice to write a comment here to describe what this code is doing. As I understand, Offsets is a sorted list of starting offsets of all entries, so we first do binary search on that list to find the start and end positions of an entry that contains a given Offset, and then find that entry from the output section. Result is cached.

158

Why don't you use dyn_cast here?

162

and cast<>?

rafael updated this revision to Diff 38260.Oct 23 2015, 1:52 PM

This uses fewer classes and makes tail merging optional, controlled by -O2 or higher (this is the same option that gold uses).

The flip side is that this depends on a new llvm class (I will upload it to another code review).

The new numbers are

gold -O1: 78213512 bytes, 1.454525942 seconds
gold -O2: 77092512 bytes, 1.583773470 seconds
master: 87482424 bytes, 0.494621872 seconds
patch -O1: 77295864 bytes, 0.563764220 seconds
patch -O2: 77209848 bytes, 0.657637513 seconds

Hi Rafael,

In my experience, people rarely use linker optimizations that are not on by default. As I recall, nether GCC nor Clang pass -O<n> through to the linker. Is it time to change that?

Thanks again,
Hal

  • Original Message -----

From: "Rafael Ávila de Espíndola via llvm-commits" <llvm-commits@lists.llvm.org>
To: ruiu@google.com
Cc: llvm-commits@lists.llvm.org
Sent: Friday, October 23, 2015 3:52:15 PM
Subject: Re: [PATCH] D13958: Mere SHF_STRING

rafael updated this revision to Diff 38260.
rafael added a comment.

This uses fewer classes and makes tail merging optional, controlled
by -O2 or higher (this is the same option that gold uses).

The flip side is that this depends on a new llvm class (I will upload
it to another code review).

http://reviews.llvm.org/D13958

Files:

ELF/InputFiles.cpp
ELF/InputSection.cpp
ELF/InputSection.h
ELF/OutputSections.cpp
ELF/OutputSections.h
test/elf2/merge-string-align.s
test/elf2/merge-string-error.s
test/elf2/merge-string-no-null.s
test/elf2/merge-string.s

llvm-commits mailing list
llvm-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

rafael updated this revision to Diff 38266.Oct 23 2015, 3:01 PM

OK, I think this version finally has it all:

  • Shares code for the string and non string case, reducing the patch and code size.
  • Tail merging is optional, using the same flag (-O2) as gold.
  • Work with llvm trunk.
  • More comments.
ruiu added a comment.Oct 23 2015, 3:11 PM

Generally looks good, but can you add a few comments as I pointed out in the previous review?

ELF/OutputSections.cpp
720–728

This path should work for !shouldTailMerge() too, no?

Bigcheese added inline comments.Oct 23 2015, 6:23 PM
ELF/InputFiles.cpp
125

I dislike the name of this function. SectionType seems to indicate the SHT_* values. getSectionKind would be fine.

rafael added inline comments.Oct 24 2015, 5:43 AM
ELF/OutputSections.cpp
720–728

No, the buffer is only created during finalize.

rafael updated this revision to Diff 38301.Oct 24 2015, 5:45 AM

More comments, change getSectionType to return a bool and rename it.

I think this covers all the requests.

ruiu accepted this revision.Oct 24 2015, 10:09 AM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Oct 24 2015, 10:09 AM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r251212.