This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Implemented --symbol-ordering-file option.
ClosedPublic

Authored by grimar on Oct 30 2016, 10:36 AM.

Details

Summary

This is close to D25766, but uses list of symbols instead of list of sections.

Numbers I got benchmarking clang -help were:
For 1000 runs of ordered and unordered build amount of pagefaults: 636 vs 641 +- 0.01%.
Time was (from real HW) 0.124324230 vs 0.127807916 (+- 0.2%), so it is 2.8% boost.

Used symbols files is https://justpaste.it/zwm4

Diff Detail

Event Timeline

grimar updated this revision to Diff 76337.Oct 30 2016, 10:36 AM
grimar retitled this revision from to [ELF, WIP] - Implemented --symbol-ordering-file option..
grimar updated this object.
grimar added reviewers: ruiu, rafael, davide.
grimar added subscribers: grimar, evgeny777, llvm-commits.
grimar updated this revision to Diff 76396.Oct 31 2016, 7:14 AM
grimar retitled this revision from [ELF, WIP] - Implemented --symbol-ordering-file option. to [ELF] - Implemented --symbol-ordering-file option..
grimar updated this object.
  • Refactored.
  • Added testcase showing ability to sort sections with the same name (.text).
ruiu edited edge metadata.Oct 31 2016, 1:58 PM

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

davide edited edge metadata.Oct 31 2016, 2:08 PM
In D26130#583948, @ruiu wrote:

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

Agree (as already pointed out in the other thread review). I discussed this a bit with Rafael and I see his point of view, i.e. clang -help is definitely a sanity check, and the number of page faults being decreased is a good thing, but I'd really love this to be tested on a startup-time sensitive application.
I mentioned browsers, in particular Firefox, because the Mozilla people spent quite a bit of time thinking about the problem, so you should be able to reuse all their infrastructure (tests etc..). Firefox should link out-the-box with lld, and if it doesn't .. well there's more work to to there =)

grimar added a comment.EditedNov 1 2016, 1:07 AM
In D26130#583948, @ruiu wrote:

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

Yes, Rui, please don't think I forgot about that opinion :) clang -help looks good because of very stable results. As I pointed I observe the same amount of faults for the same binary on 2 different PC. So it defenetely (IMO) proves that patch can be used to reduce them during startup.

In D26130#583948, @ruiu wrote:

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

Agree (as already pointed out in the other thread review). I discussed this a bit with Rafael and I see his point of view, i.e. clang -help is definitely a sanity check, and the number of page faults being decreased is a good thing, but I'd really love this to be tested on a startup-time sensitive application.
I mentioned browsers, in particular Firefox, because the Mozilla people spent quite a bit of time thinking about the problem, so you should be able to reuse all their infrastructure (tests etc..). Firefox should link out-the-box with lld, and if it doesn't .. well there's more work to to there =)

I did not yet try firefox with this patch. But as I wrote in another thread, I was unable to obtain a boost using --sections-ordering-file on clang and firefox previously. Though sections file for firefox produced by icegrind was 5 megabytes in size, what is much more than 300 entries that works with this patch.
So the noticable result using --symbol-ordering-file and Rafael's method of generating it is probably a step forward that was worth to share.

Will try to test firefox using this patch.

(about linking firefox with lld. When last time I did that, it produced next error for me during launch:

umb@ubuntu:~/firefox/clean/mozilla-central$ /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/firefox -no-remote -profile /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/tmp/scratch0
XPCOMGlueLoad error for file /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so:
/home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so: undefined symbol: gtk_main_do_event
Couldn't load XPCOM.

so I had to link firefox with gold and then relink libxul.so with lld. That should have being fine I think since that is the single core library of firefox and sections file was targeted for it.
And anyways latest tests I performed on vanilla gold to check the option itself, and result was the same).

rafael edited edge metadata.Nov 1 2016, 5:42 AM

Just some quick observations while building the patch :-)

ELF/SymbolListFile.cpp
65

It is not entirely clear you want the ScriptParser infrastructure. For example, do we have to handle quotes?

I would probably make this format dead simple. One symbol per line, symbols with newlines are not supported.

ELF/Writer.cpp
679

This is crazy inefficient.

You probably want to make SymbolOrderingFile a DenseMap from CachedStringRef to unsigned.

test/ELF/symbol-ordering-file.s
9

Changing .text to .foo should make this easier to read as .foo doesn't have an implicit alignment.

grimar added inline comments.Nov 1 2016, 9:45 AM
ELF/Writer.cpp
679

Yeah.. that was stupid from my side. I am going to address this one and other comments tomorrow.

grimar updated this revision to Diff 76695.Nov 2 2016, 5:19 AM
grimar edited edge metadata.
  • Addressed review comments.
ELF/SymbolListFile.cpp
65

I expected this file can have comments like --sections--list-file.
Not sure that more parsing code here is better than reusing ScriptParser. I did that though.

test/ELF/symbol-ordering-file.s
9

linker does not concatinate .foo into single section like .text.* :(
So I had to use .text

grimar updated this revision to Diff 76827.Nov 3 2016, 3:19 AM
  • Simplified.
davide added a comment.Nov 6 2016, 3:58 PM

Some comments inline.

ELF/SymbolListFile.cpp
7–8

Ehm, splitting a file by line is something that can be probably useful outside of lld, so I propose this to live in ADT/StringRef (or something like that).

21–22

Unless I'm mistaken, you can probably fold Arr.

ELF/Writer.cpp
39

nit: using the list
nit2: s/option//

41

I find the name sortSectionsByOrder a little confusing.

54–55

I prefer details about symbols not being exposed.

So, any reason why you can't use

template <class ELFT>
static StringRef getSymbolName(const elf::ObjectFile<ELFT> &File,
                               SymbolBody &Body) {

here?

64–65

This just duplicates the comment at the beginning of the function, so I'd either
a) make it more useful
b) remove it altogether

davide added inline comments.Nov 6 2016, 4:06 PM
test/ELF/symbol-ordering-file.s
9

Can't you just use multiple input files with two .foo sections?

davide added a comment.Nov 6 2016, 4:07 PM
In D26130#583948, @ruiu wrote:

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

Yes, Rui, please don't think I forgot about that opinion :) clang -help looks good because of very stable results. As I pointed I observe the same amount of faults for the same binary on 2 different PC. So it defenetely (IMO) proves that patch can be used to reduce them during startup.

In D26130#583948, @ruiu wrote:

Well, again, you really need to run a long sequence of computations to get meaningful numbers for benchmark. clang -help isn't useful for that purpose.

Agree (as already pointed out in the other thread review). I discussed this a bit with Rafael and I see his point of view, i.e. clang -help is definitely a sanity check, and the number of page faults being decreased is a good thing, but I'd really love this to be tested on a startup-time sensitive application.
I mentioned browsers, in particular Firefox, because the Mozilla people spent quite a bit of time thinking about the problem, so you should be able to reuse all their infrastructure (tests etc..). Firefox should link out-the-box with lld, and if it doesn't .. well there's more work to to there =)

I did not yet try firefox with this patch. But as I wrote in another thread, I was unable to obtain a boost using --sections-ordering-file on clang and firefox previously. Though sections file for firefox produced by icegrind was 5 megabytes in size, what is much more than 300 entries that works with this patch.
So the noticable result using --symbol-ordering-file and Rafael's method of generating it is probably a step forward that was worth to share.

Will try to test firefox using this patch.

(about linking firefox with lld. When last time I did that, it produced next error for me during launch:

umb@ubuntu:~/firefox/clean/mozilla-central$ /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/firefox -no-remote -profile /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/tmp/scratch0
XPCOMGlueLoad error for file /home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so:
/home/umb/firefox/clean/mozilla-central/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so: undefined symbol: gtk_main_do_event
Couldn't load XPCOM.

so I had to link firefox with gold and then relink libxul.so with lld. That should have being fine I think since that is the single core library of firefox and sections file was targeted for it.
And anyways latest tests I performed on vanilla gold to check the option itself, and result was the same).

If lld mislinks firefox, we shouldn't silently ignore it and try to link with gold, but open a bug instead (and/or try to fix).

grimar added a comment.Nov 7 2016, 1:47 AM

If lld mislinks firefox, we shouldn't silently ignore it and try to link with gold, but open a bug instead (and/or try to fix).

I think it is just an issue of their build system.
I found similar issues: https://bugzilla.mozilla.org/show_bug.cgi?id=1080654, https://bugzilla.mozilla.org/show_bug.cgi?id=1063359

widget/gtk/mozgtk/gtk3/moz.build has next lines:

if CONFIG['GCC_USE_GNU_LD']:
    no_as_needed = ['-Wl,--no-as-needed1']
    as_needed = ['-Wl,--as-needed']
else:
    no_as_needed = []
    as_needed = []

If I change the last two to have -Wl..., then build works fine for me.

test/ELF/symbol-ordering-file.s
9

I think reasonable minimum to show that sorting is performed according to order file is
at least 3 sections. So that would be main test file + 2 input files.

I find testcases that use additional inputs harder to read generally, so actually I would prefer current way
when testcase is just a two files in total.
My test has 5 sections what also allows to check spaces before/after symbol in ordering file,
doing that with .foo and multiple inputs would require 4 input files I think.

grimar updated this revision to Diff 77024.Nov 7 2016, 2:58 AM
  • Addressed review comments.
ELF/SymbolListFile.cpp
7–8

StringRef::split() can do this job, used it instead.

21–22

We don't use regexps in lld anymore. StringMatcher has a list of patterns,
them are unfolded.
So I think keeping SymbolOrderingFile as a DenseMap should be faster.

ELF/Writer.cpp
39

Done.

41

May be sortBySymbolsOrder ?

54–55

Done.

64–65

Removed.

Please also add to the test to show that if a section has multiple symbols, we use the one with the lowest index in the order file.

ELF/SymbolListFile.cpp
66

I would probably just use 0 instead of 32. This will not be small in any interesting case.

ELF/Writer.cpp
689

In case two symbols point to the same section, we probably want the first one to win. This means that this should probably use the minimum.

emaste added a subscriber: emaste.Nov 7 2016, 7:14 AM

If lld mislinks firefox, we shouldn't silently ignore it and try to link with gold, but open a bug instead (and/or try to fix).

A Firefox bug, it seems.

if CONFIG['GCC_USE_GNU_LD']:

It looks like Firefox uses this rather unfortunate test:

if test "$GNU_CC"; then
    if `$CC -print-prog-name=ld` -v 2>&1 | grep -c GNU >/dev/null; then
        GCC_USE_GNU_LD=1
    fi
fi

and there's been a history of issues detecting the linker type, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1049510.

I suspect we could set this to 1 also for lld (at least, I think it's reasonable that lld will support ld.bfd options or behaviour that Firefox relies on). However, it seems like someone familiar with Firefox ought to audit all of the uses of GCC_USE_GNU_LD and replace them with tests on the actual functionality required.

grimar updated this revision to Diff 77043.Nov 7 2016, 8:44 AM
  • Addressed review comments.
ELF/SymbolListFile.cpp
66

Done.

ELF/Writer.cpp
689

Right. It used minimum when had one more "crazy" loop outsize, but after switching to DenseMap this logic gone accidentally. Fixed.

rafael accepted this revision.Nov 7 2016, 8:57 AM
rafael edited edge metadata.

LGTM by me, but please wait for a LGTM from Rui and Davide too.

Thanks for implementing this!

This revision is now accepted and ready to land.Nov 7 2016, 8:57 AM
ruiu added inline comments.Nov 8 2016, 1:40 PM
ELF/SymbolListFile.cpp
60

This function should live in Driver.cpp as this has nothing to do with ScriptParser class.

ELF/Writer.cpp
680

I think we should iterate over Config->SymbolOrderingFile and use SymbolTable::find() instead of iterating over all symbols because the former should be much faster than the latter.

grimar updated this revision to Diff 77315.Nov 9 2016, 1:29 AM
grimar edited edge metadata.
  • Addressed review comments.
grimar added inline comments.Nov 9 2016, 1:31 AM
ELF/SymbolListFile.cpp
60

Done.

ELF/Writer.cpp
680

As Rafael already pointed out, there is no local symbols in symbol table.

ruiu accepted this revision.Nov 9 2016, 1:25 PM
ruiu edited edge metadata.

LGTM

davide accepted this revision.Nov 9 2016, 2:07 PM
davide edited edge metadata.

This looks good now. Thanks for your extreme patience addressing all my (ours) comments.

Cool ! Davide, Rafael, Rui, thank you guys for usefull comments and help with that !

This revision was automatically updated to reflect the committed changes.