This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Use relocated content when generating .gdb_index
AbandonedPublic

Authored by grimar on Mar 28 2017, 9:08 AM.

Details

Reviewers
ruiu
rafael
Summary

Previously LLVM DWARF parsers were responsible to resolve some of
relocations when .gdb_index was generated. That was extremly slow.
Also requires some additional code from LLD side, which was not fast
either (like D31330 showed).

One of solutions would be to feed parser with relocated sections content,
current parsers API allows to pass sections directly. That is what patch do.

First, it performs parsing using non-relocated content as usual. That works for everything, except
.debug_ranges section. Patch does not ask DWARF parser to parse ranges on this step,
I just count the amount of ranges in total. That allows to know the .gdb_index section size
early.

Later .gdb_index section is processed in writeTo(). Since it is synthetic, it is added after all .debug_* sections
to the output sections list. So relocations for debug sections which performed in writeTo are
done before writeTo() call for .gdb_index.
During GdbIndexSection::writeTo() I use DWARF parsers just to retrive address ranges list.
It uses relocated .debug_range section content on this step. And that allows to get correct
relocated ranges.

That also helped to resolve issue mentioned in last comments for PR32319. Testcase is provided.

Note, that even with that patch output is still broken. Example:

// GOLD:

 Low address = 0x121beb3, High address = 0x121becd, CU index = 0 (0x121becd - 0x121beb3 == 0x1A)
...
  Low address = 0x121befa, High address = 0x121f132, CU index = 1 (0x121f132 - 0x121befa == 0x3238)
  Low address = 0x121f132, High address = 0x121f160, CU index = 1 (0x121f160 - 0x0x121f132 == 0x2E)
  Low address = 0x121f160, High address = 0x121f1fb, CU index = 1 (0x121f1fb - 0x121f160 == 0x9B)

//LLD:

Low address = 0x31b68a3, High address = 0x31b68bd, CU index = 0 (0x31b68bd - 0x31b68a3 == 0x1A)
 ...
 Low address = 0x31b68ea, High address = 0x31b9b22, CU index = 1 (0x31b9b22 - 0x31b68ea == 0x3238)
 Low address = 0x0, High address = 0x12, CU index = 1             <--- BROKEN EXCESSIVE ENTRY
 Low address = 0x31b9b22, High address = 0x31b9b50, CU index = 1 (0x31b9b50 - 0x31b9b22 == 0x2E)
 Low address = 0x31b9b50, High address = 0x31b9beb, CU index = 1 (0x31b9beb - 0x31b9b50 = 0x9B)

LLD section contains broken excessive entries. That happens because there are some amount of relocations in
.debug_ranges which are against discarded sections. In that case the value we write to section is an relocation
addend. I think that can be fixed if we would skip such entries somehow when work with .debug_ranges section (may be we should regenerate it to dismiss
all entries that are against discarded sections ? That usefull optimization itself).
Or we probably can put 2 similiar values instead of address range in that case, DWARF manual says that: "A range list entry (but not a base address selection or end of list entry) whose beginning and ending addresses are equal has no effect because the size of the range covered by such an entry is zero."
So it may probably work. That issue requires additional investigating still and not relative to this patch directly.

Diff Detail

Event Timeline

grimar created this revision.Mar 28 2017, 9:08 AM
grimar added a subscriber: evgeny777.
dblaikie added inline comments.Mar 28 2017, 9:25 AM
ELF/SyntheticSections.cpp
1739–1740

I don't understand what assumption you're describing.

Oh, because relocations haven't been performed, nor handled at all - it's not possible to tell where the address ranges end, so you're providing an overestimate here? That's a bit subtle and the function name should at least mention that I would think (& the function that calls this one too)

Is there any penalty to this overestimate in how the gdb_index is created? Does a larger sized record need to be used to handle a maximum that may never be needed?

[looking at the implementation, it seems there is a penalty for overestimate - a table is reserved for that many address ranges, right? so much of that table may go unused if this is a significant overestimate (though that probably only happens if someone uses ld -r or, perhaps more likely: LTO)]

I wouldn't imagine it'd be too hard to look at enough relocation data (essentially parse the relocations applicable to debug_ranges and look for holes (bytes that don't have relocations for them) - those are the ends of range lists - you wouldn't need to apply any relocations or examine their addends, etc, only their offsets)

grimar added inline comments.Mar 29 2017, 2:16 AM
ELF/SyntheticSections.cpp
1739–1740

I don't understand what assumption you're describing.

Oh, because relocations haven't been performed, nor handled at all - it's not possible to tell where the address ranges end, so you're providing >an overestimate here? That's a bit subtle and the function name should at least mention that I would think (& the function that calls this one >too)

Right. Its straightforward and fastest way to do things I believe. It is possible to tell where is address ranges end only after handling or some parsing of relocations. I think we can find the relocation with largest offset in .debug_range section and that be enough to know that next offset is the end here. (but that would not solve everything, please see below)
I can probably rename getDieRangesCount to estimateDieRangesCount and getAddressRangesCount to estimateAddressRangesCount ?

Is there any penalty to this overestimate in how the gdb_index is created? Does a larger sized record need to be used to handle a maximum >that may never be needed?

[looking at the implementation, it seems there is a penalty for overestimate - a table is reserved for that many address ranges, right? so much >of that table may go unused if this is a significant overestimate (though that probably only happens if someone uses ld -r or, perhaps more >likely: LTO)]

Yes, penalty is size. I am not aware about LTO, I knew about ld -r here only. I think perfomance is preferable over possible size penalty in general. And this patch just a first step in direction to use relocated data, so we can think about how to solve that after this change probably.

I wouldn't imagine it'd be too hard to look at enough relocation data (essentially parse the relocations applicable to debug_ranges and look for >holes (bytes that don't have relocations for them) - those are the ends of range lists - you wouldn't need to apply any relocations or examine >their addends, etc, only their offsets)

There is one more issue I mentioned in description - is when address range belongs to discarded section. This case can not be catched until relocations are scanned at later stages, because we cant just scan relocations early when we do not know which sections were discarded.
So since these issues are related, I would fix them in a separate patch(es). Scanning of relocations also will probably affect perfomance, and should be done in a some smart way.

grimar abandoned this revision.May 15 2017, 3:47 AM

Abandoning for now, because different approach (D33122) looks more promising direction.