This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Speedup readAddressArea() implementation.
AbandonedPublic

Authored by grimar on May 4 2017, 2:58 AM.

Details

Reviewers
ruiu
rafael
Summary

readAddressArea() implementation in part of searching inputsection
that range belongs to is ineffective. Implementation is O(N*M) where
N is number of input sections and M is number of ranges. Since
each input section may have own range, complexity can be quadratic.

Patch reimplements it using binary search.

For benchmark I used debug LLC binary built with -gsplit-dwarf and -ggnu-pubnames
Numbers are (20 runs for each):

  1. Original LLD without --gdb-index link time is 1,644984654 seconds ( +- 0,28% )
  2. LLD without patch + --gdb-index link time is 6,200055217 seconds ( +- 0,15% )
  3. LLD with this patch + --gdb-index link time is 4,305998886 seconds ( +- 0,16% )

Time spent for building gdbindex changes from (6.2 - 1.644 == 4.556) to (4.305 - 1.644 == 2.664).
That is 2.664/4.556 = 0.584 or about 42% speedup.

I also tried different solution without use of additional vector, demo is available: D32851, but this
patch is faster and looks cleaner.

Diff Detail

Event Timeline

grimar created this revision.May 4 2017, 2:58 AM
dblaikie added inline comments.May 4 2017, 10:27 AM
ELF/SyntheticSections.cpp
1712–1715

This is often done nested like this, if you like

V.erase(llvm::remove_if(V, []...), V.end());
1724

Is 'V' guaranteed to be sorted?

1728

need any error handling for if R.first isn't in any entry in V?

grimar updated this revision to Diff 97921.May 5 2017, 3:02 AM
  • Addressed review comment.
ELF/SyntheticSections.cpp
1712–1715

Yeah. I did that initially and rewrote later, because that looks a bit bulky for my eye after clang-format.
Made as nested again.

1724

We initialize File->Sections in the same order they are in objects section table:
https://github.com/llvm-mirror/lld/blob/master/ELF/InputFiles.cpp#L282

From my observations input sections in objects are always being sorted by file offset.
(just because usual code seems assigns file offsets for sections first and then writes them in the same order or something like that)
That probably not a mandatory requirement, but looks to be reasonable assumption.

I can probably add some additional assert-check to verify array is sorted or call sort explicitly, not sure if latter worth to do.

Not relative to this patch:
Another Idea I had in mind is to make DWARFAddressRangesVector to store not only Low/High index, but also a object::SectionRef for example.
That would remove the need of returning fake section address from llvm::LoadedObjectInfo::getSectionLoadAddress and should allow LLD to get
section index from SectionRef somehow. Then we will not need any loops for doing a range section search and approach would be cleaner and faster.
But that would require DWARF parsers API change, I would probably not do that change right now, because it is not yet clear if we want it or not, because for example using relocated output for building index solves this problem in a different way, but at the same time I guess we can build index using multithreading and it is probably easier to not use relocated output for that and just parse each file in a parrallel loop. I am going to investigate that all and for now would go with current patch as it is simple LLD side solution that gives solid instant speedup.

1728

I though about that too. As far I understand that can happen only if we have some broken range in DWARF,
like corrupted entry in .debug_ranges section. I probably do not see other way to achieve that.
And that is a dillemma about what to do then. We can do additional check of course, but at the same time,
AFAIK LLD policy is that we are OK to produce broken output if input was broken.

I can do additional check here and error out, just don't sure if we really need it (how much the case is real to happen).

dblaikie added inline comments.May 5 2017, 2:13 PM
ELF/SyntheticSections.cpp
1724

If it's not guaranteed by ELF to be sorted - best not to rely on it. You might be able to construct a test case by modifying LLVM's MC layer and having it produce the headers in reverse order (or just swapping the first two, etc).

grimar added inline comments.May 12 2017, 3:49 AM
ELF/SyntheticSections.cpp
1724

I see, thanks for suggestion and your time. I think this patch becomes a bit complicated solution then. If we need to add sorting that probably will bring some slowdown and code complication, new testcase with swapped sections will be just temporarily needed, because if we will change the DWARF parsers API to return section index for ranges somehow instead, for example, we will not need to do any slow things like sorting+searching on LLD side anymore at all.

Today I posted patch for building .gdb_index in a multithreaded way (D33122). That patch also suffers from weakness I am trying to fix here.
If will be desided that we are fine to go with D33122, then it seems to me that it will worth to extend DWARF API instead of continue work on this patch.

grimar abandoned this revision.May 15 2017, 3:12 AM

I believe D33183 is much better.