This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Stop producing broken entries in .debug_ranges section
AbandonedPublic

Authored by grimar on Mar 29 2017, 8:08 AM.

Details

Reviewers
ruiu
rafael
Summary

Address range lists are contained in a separate section called .debug_ranges.
In short it contains start/end address range pairs.
zero/zero pair means the end of range list. Values of this section
in object files are relocatable.

Perviously LLD would generate wrong output if relocation applied
was against discarded section. LLD would write addend of relocation in that case,
what produced broken ranges, like (0x00000000 - [Addend value]), for example.

According to DWARF specification, it is safe to produce range entries with
same begining and end address: "A range list entry (but not a base address selection or end of list entry) whose beginning and
ending addresses are equal has no effect because the size of the range covered by such an
entry is zero"

GNU LD in such case produces stub entry with start/end = 1.
(https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3)

I suggest to do the same. I am planning to use it for .gdb_index, D31424 description contains information about issue I faced,
this patch should fix it and that allows to use relocated output for generating .gdb_index section in final without additional parsing
of relocations.

Probably we want to do the same for other debug sections (but place zero instead of 1, that is what ld do, though I did not investigate that deeper),
but now I faced problems only with .debug_ranges, so patch changes behavior for that section only.

Diff Detail

Event Timeline

grimar created this revision.Mar 29 2017, 8:08 AM

Does GNU ld (binutils, or gold, or both?) actually special case this relocation in this section? (ie: a small hand-written assembly file with only debug_ranges ends up with different contents if you rename the section to debug_foo_ranges? (zeros in the latter case, 1s in the original case)

Does GNU ld (binutils, or gold, or both?) actually special case this relocation in this section? (ie: a small hand-written assembly file with only debug_ranges ends up with different contents if you rename the section to debug_foo_ranges? (zeros in the latter case, 1s in the original case)

In my tests only GNU ld handled that, gold generated broken entry. And yes, GNU ld handles this section separatelly by name, see:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/reloc.c;h=4e95d85f0b94aacb5995c6f2307f94e6ede206fb;hb=e4067dbb2a3368dbf908b39c5435c84d51abc9f3#l1591

Generally looks okay to me. I've made a few inline comments.

This approach works for .debug_ranges, because we are working with pairs of addresses, but we have to be more careful with other DWARF sections, because on some architectures, address zero is a valid address and should not be used, and we cannot simply set the size to zero, because we don't know where that is (if anywhere at all). gdb recognizes zero as special, if the architecture does not allow zero addresses, but otherwise it is not treated as special, and so we end up with (potentially) overlapping debug info entries which can cause problems with the debugger. Similarly, .debug_aranges uses an address and a fixed size, which we have to update correctly. There's a related discussion on the LLDB mailing list I started about handling GC'ed sections in the debugger: http://www.mail-archive.com/lldb-dev@lists.llvm.org/msg04140.html. The general agreement is that we should use a special non-zero address. I would suggest -1 (i.e. 0xFFFFFFFF in 32-bit DWARF addresses and 0xFFFFFFFFFFFFFFFF in 64-bit DWARF addresses), but we should have that discussion on the mailing lists.

ELF/InputSection.cpp
487

Should be called isInDiscardedSection, because it is working with a symbol not a section. "isDiscardedSection" suggests to me that it is asking if a section has been discarded or not, and so would take an InputSection or section index.

510

Is there anywhere else that we check section names? If so, I'd personally prefer if we set some flag on the section early rather than doing string comparisons on every input section to test the name multiple times.

533–538

A few grammar nits and minor improvements to this comment. I suggest the following:

"If an address range list contains an entry that belongs to a discarded section, we should perform a fix-up. We cannot write zero as the start and end address because such a pair is an end of range list marker in the .debug_ranges section. We put a placeholder value equal to 1, which is fine because a range list entry whose beginning and end address are equal has no effect, because the size of the range is zero in such case."

Also, one more thing: .debug_ranges can contain "base address selection entries" which have a -1 and address pair, hence why we cannot use -1 in this situation. I'm not sure how the address is set, as I have never seen a real-life example of this, but I'm wondering if there's some possible conflict with this patch, i.e. we might end up with a -1, 1 base selection entry, if there is a relocation to a discarded section used here. I have no idea if this is really something we need to worry about though (I think probably not).

Generally looks okay to me. I've made a few inline comments.

This approach works for .debug_ranges, because we are working with pairs of addresses, but we have to be more careful with other DWARF sections, because on some architectures, address zero is a valid address and should not be used, and we cannot simply set the size to zero, because we don't know where that is (if anywhere at all). gdb recognizes zero as special, if the architecture does not allow zero addresses, but otherwise it is not treated as special, and so we end up with (potentially) overlapping debug info entries which can cause problems with the debugger. Similarly, .debug_aranges uses an address and a fixed size, which we have to update correctly. There's a related discussion on the LLDB mailing list I started about handling GC'ed sections in the debugger: http://www.mail-archive.com/lldb-dev@lists.llvm.org/msg04140.html. The general agreement is that we should use a special non-zero address. I would suggest -1 (i.e. 0xFFFFFFFF in 32-bit DWARF addresses and 0xFFFFFFFFFFFFFFFF in 64-bit DWARF addresses), but we should have that discussion on the mailing lists.

Interesting. I worried about other sections and so this patch touches only .debug_ranges. Thanks for your comments !

ELF/InputSection.cpp
487

I agree.

510

This method is isolated and used for doing relocations for non allocatable things like debug sections.
I would not let this a bit wierd but reasonable logic to go outside.

Could we reduce the size of the test by removing many of the debug info attributes? I don't know what dwarfdump requires, but I expect we could get rid of basically all the DW_AT_... entries except the DW_AT_ranges tag.

test/ELF/debug-ranges-reloc.s
4

We only need the -debug-dump=ranges, since we are only interested in the .debug_ranges output.

11

No need for this line.

13–15

I think we could get rid of these lines (similarly below in main).

grimar updated this revision to Diff 94015.Apr 4 2017, 1:32 AM
  • Addressed review comments.
grimar added a comment.Apr 4 2017, 1:36 AM

Could we reduce the size of the test by removing many of the debug info attributes? I don't know what dwarfdump requires, but I expect we could get rid of basically all the DW_AT_... entries except the DW_AT_ranges tag.

I tried to reduce the testcase as much as possible in updated diff.

ELF/InputSection.cpp
533–538

Fixed. thanks !

test/ELF/debug-ranges-reloc.s
4

It does not work. Looks llvm-dwarfdump has an issue here (it does not dump anything with =ranges only for this testcase), I'll try to investigate and may be prepare patch for it separatelly. I suggest to go with =all for now.

11

Removed.

13–15

Removed.

LGTM, with the inline comments. One of @ruiu or @rafael should probably still approve it though.

ELF/InputSection.cpp
487

I think this can be a const reference.

533–538

Sorry, one more grammar nit I missed - the last bit should say "in such a case."

test/ELF/debug-ranges-reloc.s
4

I just ran into this problem as well! Fixing dwarfdump would be great.

grimar added inline comments.Apr 4 2017, 3:30 AM
ELF/InputSection.cpp
487

We usually do not use const references in LLD.
Its a bit complicated. I think no const is consistent with code and ok here.

533–538

Thanks !

grimar abandoned this revision.Apr 20 2017, 3:21 AM

Its not yet clear how much it is really needed, abandoning for now.