This is an archive of the discontinued LLVM Phabricator instance.

Optimize orphan placement in a general way
ClosedPublic

Authored by rafael on May 11 2017, 5:18 PM.

Details

Reviewers
ruiu
Summary

We used to place orphans by just using compareSectionsNonScript.

Then we noticed that, since linker scripts can use another order, we should first try to match the section to a given PT_LOAD. But there is nothing special about PT_LOAD. The same issue can show up for PT_GNU_RELRO, for example.

In general, we have to search for the most similar section and put the orphan next to it. "Most similar" is defined as how long the two sections follow the same code path in compareSectionsNonScript.

That is what this patch does. We now compute a rank for each output section, with a bit for each branch in what was compareSectionsNonScript.
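As a rough illustration of the idea, here is a minimal sketch of a bit-per-branch rank. Only RF_NOT_ALLOC is a name taken from this review; the other flags, their bit positions, the SectionInfo struct and the helper names are hypothetical and are not the code in ELF/Writer.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flags: one bit per branch of the old comparison function.
// Earlier (more significant) branches get higher bits, so comparing two
// ranks as integers reproduces the branch-by-branch ordering.
enum RankFlags : uint32_t {
  RF_NOT_ALLOC = 1u << 3, // name from the review; the bit position is assumed
  RF_WRITE = 1u << 2,     // assumed flag
  RF_EXEC = 1u << 1,      // assumed flag
  RF_BSS = 1u << 0,       // assumed flag
};

// Hypothetical stand-in for the properties of an output section.
struct SectionInfo {
  bool Alloc;
  bool Writable;
  bool Executable;
  bool NoBits;
};

// Compute the rank once per section instead of re-running every branch
// on each comparison.
uint32_t getSectionRank(const SectionInfo &S) {
  uint32_t Rank = 0;
  // Non-allocatable sections go last and need no further distinction.
  if (!S.Alloc)
    return Rank | RF_NOT_ALLOC;
  if (S.Writable)
    Rank |= RF_WRITE;
  if (S.Executable)
    Rank |= RF_EXEC;
  if (S.NoBits)
    Rank |= RF_BSS;
  return Rank;
}

// The comparison itself reduces to an integer compare.
bool compareByRank(const SectionInfo &A, const SectionInfo &B) {
  return getSectionRank(A) < getSectionRank(B);
}
```

With this layout, sorting sections is a plain integer sort on the precomputed rank.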

With this, findOrphanPos is now fully general, and orphan placement can be optimized by placing every section with the same rank at once.
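The following sketch shows one way that placement could work when sections are represented only by their ranks. It assumes that "most similar" can be measured as the number of leading bits two ranks share, and that orphans are sorted first so equal ranks are adjacent. The names rankProximity, findOrphanPosSketch and placeOrphans are hypothetical; this is not lld's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of leading bits two ranks have in common, i.e. how long the two
// sections would have followed the same path through the comparison.
int rankProximity(uint32_t A, uint32_t B) {
  uint32_t Diff = A ^ B;
  int Shared = 0;
  for (uint32_t Bit = 1u << 31; Bit != 0 && !(Diff & Bit); Bit >>= 1)
    ++Shared;
  return Shared;
}

// Index at which to insert an orphan: just after the last already placed
// section whose rank is most similar to the orphan's rank.
size_t findOrphanPosSketch(const std::vector<uint32_t> &Placed,
                           uint32_t OrphanRank) {
  if (Placed.empty())
    return 0;
  size_t Best = 0;
  int BestProximity = -1;
  for (size_t I = 0; I != Placed.size(); ++I) {
    int P = rankProximity(Placed[I], OrphanRank);
    if (P >= BestProximity) { // ">=" keeps the last of several equal matches
      BestProximity = P;
      Best = I;
    }
  }
  return Best + 1;
}

// Place orphans in batches: one position search per distinct rank rather
// than one per orphan section.
std::vector<uint32_t> placeOrphans(std::vector<uint32_t> Placed,
                                   std::vector<uint32_t> Orphans) {
  std::stable_sort(Orphans.begin(), Orphans.end());
  for (auto I = Orphans.begin(); I != Orphans.end();) {
    uint32_t Rank = *I;
    auto E = std::find_if(I, Orphans.end(),
                          [Rank](uint32_t R) { return R != Rank; });
    size_t Pos = findOrphanPosSketch(Placed, Rank);
    Placed.insert(Placed.begin() + Pos, I, E);
    I = E;
  }
  return Placed;
}
```

The batching is where the saving would come from in this sketch: the linear search runs once per group of equal-rank orphans, so an input with many orphans of the same rank costs a single search.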

The included testcase is a variation of many-sections.s that uses allocatable sections to avoid the fast path in the existing code. Without threads, it goes from 46 seconds to 0.9 seconds.

Diff Detail

Event Timeline

rafael created this revision. May 11 2017, 5:18 PM
ruiu accepted this revision. May 11 2017, 5:38 PM

LGTM

ELF/Writer.cpp
653–654

You can use .count() instead of .find().

667–668

You can return Rank | RF_NOT_ALLOC

762–763

Can you use at instead of find?

This revision is now accepted and ready to land. May 11 2017, 5:38 PM
andrewng added inline comments.
ELF/Writer.cpp
645

Is there a reason why this starts with 1 and not 0? i.e. 1 << 0

espindola closed this revision. Mar 14 2018, 4:03 PM
espindola added a subscriber: espindola.

302903