Page MenuHomePhabricator

[LLD][ELF][DebugInfo] llvm-symbolizer shows incorrect source line info if --gc-sections used
Needs ReviewPublic

Authored by avl on Mar 19 2019, 10:34 AM.

Details

Summary
 [LLD][ELF][DebugInfo] llvm-symbolizer shows incorrect source line info if --gc-sections used for building binary.
    
 This patch is a fix for https://bugs.llvm.org/show_bug.cgi?id=41124 .
    
 When -Wl,--gc-sections is used : unreferenced .text sections could be removed,
 but debug info related to these deleted sections left unchanged.
 Address ranges for this not needed debug info could overlap with
 address ranges for correct code. In such case symbolizer shows
 line information incorrectly. 

 The real root of the problem is that irrelevant debug information left in the 
 binary. So the proper fix would be to delete that information completely. It would 
 probably be necessary to  store debug information in COMDAT sections so that 
 linker would be able to remove not only unreferenced .text section but related debug 
 info also.

That fix implements workaround for the problem. It resolves relocations related to
debug info for deleted code so that they do not clash with real code. To achieve this
it set address ranges for such unneeded debug info out of module scope.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ruiu added inline comments.Mar 21 2019, 2:24 PM
lld/ELF/InputSection.cpp
836

nit:

"If relocation points to the deleted section (-Wl,--gc-sections) then -> it is necessary"

"If relocation points to the deleted section (-Wl,--gc-sections), then it is necessary ..."

837

Please use apostrophe instead of backquote.

841

No need to sign-extend because this 64-bit value will never be sign-extended.

ruiu added a comment.Mar 21 2019, 2:27 PM

Looks like the test files are huge. I order to test this functionality alone, you don't need a small file containing one relocation pointing to a to-be-discarded section. You don't really have to test it against actual data.

s/you don't need/you don't need anything but/

avl updated this revision to Diff 191842.Mar 22 2019, 3:29 AM

updated tests and addressed comments.

grimar added inline comments.Mar 22 2019, 3:37 AM
lld/test/ELF/Inputs/gc-sections-debuginfo-32.s
7 ↗(On Diff #191842)

You do not need these 2 new input files I think.
Just inline the code like we often do in LLD (e.g. https://github.com/llvm-mirror/lld/blob/master/test/ELF/linkerscript/no-filename-spec.s)

lld/test/ELF/gc-sections-debuginfo.s
35

Please remove the spaces after ...FFF and | symbol, no need to test we print it here.

jhenderson added inline comments.Mar 22 2019, 3:49 AM
lld/test/ELF/gc-sections-debuginfo.s
35

It's worth noting that unless you are using FileCheck's --strict-whitespace, then all contiguous runs of whitespace, both in the pattern and the output, are treated as a single space, so this extra spacing has no effect currently either. If you want to be sure that the number ends when it does, with not output afterwards, you can also do something like FFFF{{ }} or FFFF{{$}} which will check that a space or end of line respectively follows immediately.

@jhenderson and I did a prototype of DWARF-unit-per-function last year, and I was not favorably impressed by the numbers. But maybe he still has the actual data kicking around somewhere. We did not solve all the issues to our satisfaction before we ran out of time.
I've also had chats with @bd1976llvm about fragmenting .debug_info per-function without wrapping everything in units; you get into requiring some section-order things but it saves the unit-per-function overhead.

This is pretty much what I was talking about. You definitely don't need to do it as units if you're just including the concrete function. FWIW this is what I was talking about in the dwarf meeting where we talked about bringing units back into debug_info :)

And now I understand what you were saying back then! Section fragments for concrete but strippable functions does make sense, given that the object file can be constructed in a way that the linker doesn't need to know anything special in order to form a syntactically correct unit. Cross-fragment references need to be relocations not immediate, which is sad, but we get a size benefit in the end.

lld/test/ELF/gc-sections-debuginfo.s
35

I disagree. Certainly for the 32-bit case we want to show that there is only a 32-bit value, and to do that we must check for something after the value. For consistency doing the same for the 64-bit case is appropriate.

<Trying again via phabricator rather than replying on the list.>

Oh, and as far as dwarf aware linking I'd arguably prefer a separate tool ala dsymutil or dwz rather than incorporating it into the linker. I do agree that such a tool is incredibly useful and should be part of our ongoing development plan around debug info.

dwz looks great. I was unaware that it existed. Thanks!

Is GNU compatibility an issue? I did some research on this a while ago:

GOLD:
Stripping (gc-sections):
.debug_info - patch low pc to 0
.debug_ranges - patch low address to 0 and high address to "size of function + 1".

Deduplication (icf):
.debug_info - patch low pc to the address of the chosen function
.debug_ranges - patch the start and end addresses to the start and end addresses of the chosen
function

COMDATs:
.debug_info - patch low pc to 0
.debug_ranges - patch the start and end addresses to the start and end addresses of the chosen
function

GNU LD:
Stripping (gc-sections):
.debug_info - patch low pc to 0
.debug_ranges - patch the low and high address to 1 to make a size 0 range.

Deduplication (icf):
.debug_info - N/A
.debug_ranges - N/A

COMDATs:
.debug_info - patch low pc to 0
.debug_ranges - patch the start and end addresses to the start and end addresses of the chosen
function
What does GDB make of -1/-2???

grimar added inline comments.Mar 22 2019, 3:56 AM
lld/test/ELF/gc-sections-debuginfo.s
35

I think in the given case when we just have .long/.quad relocation in the inputs we perhaps do not need any additional testing. We actually care only about first 4/8 bytes of the section and their value.

For clarity, we might add a check for the section Size field perhaps though.

grimar added inline comments.Mar 22 2019, 4:09 AM
lld/test/ELF/gc-sections-debuginfo.s
35

Testing of the Size field of the section should work for this purpose.
(I think it is the best way to know how much data section contains).

avl added a comment.Mar 22 2019, 6:55 AM

@bd1976llvm

What does GDB make of -1/-2???

Speaking of GDB it looks like it behaives exactly the same. Please check results for the example from bug 41124 :

[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
gdb res.out
(gdb) break main.cpp:1
Breakpoint 1 at 0x400497: file main.cpp, line 1.
(gdb) break not_used.cpp:1
Breakpoint 2 at 0x0: file not_used.cpp, line 1.
(gdb) list main
1 int main(void) {
2 return 0;
3 }
(gdb) list foo
Function "foo" not defined.
(gdb) list main.cpp:1
1 int main(void) {
2 return 0;
3 }
(gdb) list not_used.cpp:1
1 void foo () {
2 asm(".rept 6000000; nop; .endr");
3 }

[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
gdb res.out
(gdb) break main.cpp:1
Breakpoint 1 at 0x2010e7: file main.cpp, line 1.
(gdb) break not_used.cpp:1
Breakpoint 2 at 0xfffffffffffffffe: file not_used.cpp, line 1.
(gdb) list main
1 int main(void) {
2 return 0;
3 }
(gdb) list foo
Function "foo" not defined.
(gdb) list main.cpp:1
1 int main(void) {
2 return 0;
3 }
(gdb) list not_used.cpp:1
1 void foo () {
2 asm(".rept 6000000; nop; .endr");
3 }

Speaking of gnu addr2line it looks like it become to work more correctly :

[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
addr2line -e res.out 0x400497
/home/avl/bugs/gc_debuginfo/not_used.cpp:2 <<<<<<<<<<<<<<<<<<<<<<<<<< incorrect

[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
addr2line -e res.out 0x2010e7
/home/avl/bugs/gc_debuginfo/main.cpp:3

avl updated this revision to Diff 191909.Mar 22 2019, 10:43 AM

updated tests as requested.

I have no more comments, this looks fine to me now.
A minor nit is inline.

lld/test/ELF/gc-sections-debuginfo.s
12

Hint: you can use -check-prefixes=CHECK,CHECK32

I think at this point the "debug info cabal" is more in favor of fragmenting the DWARF emitted by the compiler, so that it can be gc'd in tandem with the related functions. That solution costs extra relocations and extra sections in the .o file, but uses all existing mechanisms in the linker to achieve the goal.

But that opinion is based on the chat here and in D54747, and without a real llvm-dev discussion behind it.

avl added a comment.Thu, Mar 28, 5:29 AM

@probinson

I think at this point the "debug info cabal" is more in favor of fragmenting the DWARF emitted by the compiler, so that it can be gc'd in tandem with the related functions.

AFAIU this decision does not make this patch obsolete. Until above solution implemented or if some concrete dwarf producer would not implement that solution at all - it would be useful to have better line reporting. This patch improves llvm-symbolizer and addr2line reporting right after it would be integrated.
It also does not prevent future implementation of better solution (including D54747). D54747 solves some sort of problems but not all. There would still be cases with overlapping address ranges because of not removed debug info.

I think it makes sense to integrate that patch and keep it when(if) above "fragmenting the DWARF " solution would be done.

A resolution on the compiler/assembler side of things (basically i.e. not rewriting DWARF) will still need something in the linker for a while, due to old objects, if we want to have a perfect experience.

lld/ELF/InputSection.cpp
834

This will patch any non-alloc section that is relocated, not just .debug* sections. I think it should be scoped to those kinds of sections only, as it could break the behaviour for other unknown sections that people may have.

835

If relocation -> If the relocation

You don't need to put "-Wl," in the switch, since we're in the linker here already. Just use --gc-sections directly.

836

to make start offset to be out of -> to make the start offset be outside the

837

to not to overlap with -> to not overlap with the

838

This bit doesn't quite make sense. Perhaps "We use UINT64_MAX-1 because..."

lld/test/ELF/gc-sections-debuginfo.s
2

General nit throughout this test: you are using a mixture of '-' and '--' for long switches, which is a bit jarring. Please switch to using '--' throughout, if possible.

This test should test other debug section patching too, as the problem is not specific to .debug_info. The ones I'm thinking of in particular are .debug_types, .debug_ranges, .debug_line and .debug_aranges. Since the behaviour is actually patching of any debug section, perhaps it would be better to just call the section .debug_foo or similar?

5

I don't know what the common format in LLD tests is off the top of my head, but I find it easier if the '|' is on the following line:

# RUN: echo .... \
# RUN:   | llvm-mc ...
7

Nit: spurious extra space between -sections and -section-data

12

Also applies above.

15

Replace:
"When -Wl,--gc-sections used - linker ..."
with:
"When --gc-sections is used, the linker ..."

16

sections. But their... -> sections, but their

Use "debug data" instead of "debug_info" throughout here, because .debug_info is just one section in the debug data.

18

address'es -> addresses

19

out of module -> out of the module

20

relocation from -> relocations from the

21

to deleted section .text -> to a deleted .text section
into -> to

grimar added inline comments.Thu, Mar 28, 6:24 AM
lld/ELF/InputSection.cpp
834

Hmmm... I do not agree here. I think if the user's section assumes that linker resolves the relocation to a removed section as a zero, that does not seem the correct assumption to me. I doubt we should care that we switched the behavior here (or at least we probably want to know about such cases before explicitly supporting them, I think).

avl marked an inline comment as done.Thu, Mar 28, 6:35 AM
avl added inline comments.
lld/ELF/InputSection.cpp
834

In some previous version of the patch there was a check for Name.startswith(".debug") which I was asked to remove because "checking the symbol name is very hacky". Would it be OK to return it here ? Or probably solution similar to https://reviews.llvm.org/D54747 would be OK ? (create a flag Debug, set it Debug = Name.startswith(".debug"), and check it here)

avl updated this revision to Diff 192656.Thu, Mar 28, 9:26 AM

addressed all comments.

jhenderson added inline comments.Thu, Mar 28, 9:55 AM
lld/ELF/InputSection.cpp
834

I don't feel strongly about it, so if there's opposition, what is here is probably fine, I think.

835

Nit: the -> a

lld/test/ELF/gc-sections-debuginfo.s
5

Could you indent this like you were before, please? Same below.

16

info left -> data is left

17

overlapp -> overlap

Apologies that I didn't check the information on the original bug. The GDB behaviour does look reasonable, thanks.

lld/ELF/InputSection.cpp
834

Checking the section name (not symbol name) is not *hacky* in this case. Section names that start with the prefix ".debug" are reserved "for the system". Checking that the section name starts with the prefix ".debug" is the correct ELF way to determine if it is a debug information section.

avl marked an inline comment as done.Thu, Mar 28, 10:27 AM
avl added inline comments.
lld/test/ELF/gc-sections-debuginfo.s
5

I did not get it. What exactly should be done ?

this :

  1. RUN: echo '.section .text,"a"; .byte 0; .section .debug_foo,"",@progbits; .quad .text' \
  2. RUN: | llvm-mc --filetype=obj --arch=x86-64 - -o %t

should be change into this ?

  1. RUN: echo '.section .text,"a"; .byte 0; .section .debug_foo,"",@progbits; .quad .text' | \
  2. RUN: llvm-mc --filetype=obj --arch=x86-64 - -o %t
avl updated this revision to Diff 192691.Thu, Mar 28, 11:34 AM

addressed all comments, except indentation one, which I did not understand yet.

grimar added inline comments.Fri, Mar 29, 1:19 AM
lld/ELF/InputSection.cpp
834

But testing the symbol name inside a hot relocation loop and changing the behavior depending on that is itself a bit hacky, isn't?
My main point I think is that if there is a chance we can just do not do it - I would try it. If that will not work for some reason - then it needs a good comment and also a test case explaining why the generic approach does not work.

lld/test/ELF/gc-sections-debuginfo.s
5

I think James wanted you to add a double space (identation) before llvm-mc:

# RUN: echo '.section .text,"a"; .byte 0;  .section .debug_foo,"",@progbits; .quad .text' \
# RUN:   | llvm-mc --filetype=obj --arch=x86-64 - -o %t
jhenderson added inline comments.Fri, Mar 29, 2:55 AM
lld/ELF/InputSection.cpp
835

Sorry, one more comment here, now that I read it again. Change:
"... (--gc-sections) ..." to ", e.g. due to --gc-sections,"

COMDATs are removed even without --gc-sections, for example.

lld/test/ELF/gc-sections-debuginfo.s
5

Yes, exactly this. It makes it a little easier to read this as a continuation.

avl updated this revision to Diff 192807.Fri, Mar 29, 4:57 AM

added indentation and corrected sections string.

Thanks. I have no more comments. I think this would be a good thing to do, but it definitely needs some wider agreement from @ruiu and the other LLD contributors.

ruiu added inline comments.Mon, Apr 8, 11:33 PM
lld/ELF/InputSection.cpp
840

Where in this document can I find a description why UINT64_MAX is special? Is using UINT64_MAX-1 as an "invalid value" just a local convention of lld (i.e. are you inventing a new notion) or a common practice? Is there any more common value representing an invalid address than UINT64_MAX-1?

avl marked an inline comment as done.Mon, Apr 8, 11:41 PM
avl added inline comments.
lld/ELF/InputSection.cpp
840

It is based on the comment by @jhenderson in start of this thread. He mentioned that -1 has issues in .debug_ranges. I would find the real description of using -1 for .debug_ranges and put here...

ruiu added inline comments.Mon, Apr 8, 11:45 PM
lld/ELF/InputSection.cpp
840

It's fine if -1 has a special meaning, but it doesn't necessarily mean -2 is the second best. For example, why don't you use 0? What's special about -2? Or is it just arbitrary?

avl marked an inline comment as done.Mon, Apr 8, 11:53 PM
avl added inline comments.
lld/ELF/InputSection.cpp
840

We need to prevent address clashing for debug info referenced to valid sections and to deleted sections. To achieve that it is necessary to set an address which is out of module scope. So the idea is to set maximal allowed address value. UINT64_MAX is maximal value but used by .debug_aranges. UINT64_MAX -1 is the next maximal value.

address range 0...0+LENGTH could clash with existing range.
address range UINT64_MAX -1...UINT64_MAX -1+LENGTH is unlikely to be clashed.

ruiu added a comment.Tue, Apr 9, 12:56 AM

I thought that address 0 is a good candidate to represent an invalid address, as nullptr is usually represented by address 0. I wonder why we want to use UINT64_MAX-2 instead of 0.

MaskRay requested changes to this revision.EditedTue, Apr 9, 1:57 AM

lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm-symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)

If you make ".rept 2000000; nop; .endr" larger, you'll notice bfd linked executable has the same problem (its image base is 0x400000).

InputSection.cpp:833
Target->relocateOne(BufLoc, Type, SignExtend64<Bits>(Sym.getVA(Addend)));

I think the current handling in lld is quite good, no need to special case on if (!Sym.getOutputSection()). The same problem can be reproduced for ld.bfd and gold.

I think the right fix is to improve the heuristics used in llvm-symbolizer, rather than change lld. llvm-symbolizer may deal with executables linked by ld.bfd and gold.

One change we can make is that, in DWARFDebugAranges::extract, if LowPC is zero and the file is not a relocatable object, don't call appendRange(CUOffset, LowPC, HighPC);.

This revision now requires changes to proceed.Tue, Apr 9, 1:57 AM

A workaround is to set a larger image base with --image-base=.

One change we can make is that, in DWARFDebugAranges::extract, if LowPC is zero and the file is not a relocatable object, don't call appendRange(CUOffset, LowPC, HighPC);.

I'm not sure if making zero DW_AT_low_pc special for executable/DSO in DWARFDebugAranges is a good idea.

avl added a comment.Tue, Apr 9, 3:55 AM

I thought that address 0 is a good candidate to represent an invalid address, as nullptr is usually represented by address 0. I wonder why we want to use UINT64_MAX-2 instead of 0.

using 0 in this case could lead to following problems :

  1. on some platforms 0 is a valid address. i.e. image base starts from 0. In that case no way to differentiate valid address from invalid one just by checking for 0.
  2. if size of deleted section is long enough then debug range could clash with valid section address range. That leads to incorrect debug info selecting. As written in another message :

lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm->symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)

Using UINT64_MAX-1 prevents from above problems and allow to select correct DWARF entry.

in other words: using 0 as start of address range could lead to overlapping with valid address range.
using UINT64_MAX-1 as start of address range makes address range of deleted section be outside of module address space and would not clash with existing ranges.

avl added a comment.Tue, Apr 9, 4:08 AM

I think the current handling in lld is quite good, no need to special case on if (!Sym.getOutputSection()). The same problem can be reproduced for ld.bfd and gold.

I think the right fix is to improve the heuristics used in llvm-symbolizer, rather than change lld. llvm-symbolizer may deal with executables linked by ld.bfd and gold.

But there is no way to heuristically detect this case which works generally. On some platforms image_base could start from 0. llvm-symbolizer just does not have enough information to decide whether address range correct or not. At the same time linker does have such information. So it looks correct to do that processing in linker. Linker knows that section deleted and able to set not matching address while resolving relocations pointed to this section.

Good general solution would be to delete not used debug info.

But for present time when that information exist - using UINT_MAX64-1 is a good way to resolve ambiquity. It is better than some heuristic which does not work for all cases.

avl added a comment.Tue, Apr 9, 4:32 AM

I'm not sure if making zero DW_AT_low_pc special for executable/DSO in DWARFDebugAranges is a good idea.

That could be seen not as special value but as a value which does not clash with other ranges. For example Linker could(as an imaginary processing) assign address to the section and delete it after address assigned. In that case relocation could be resolved into this not overlapped area. Using UINT64_MAX-1 is similar to that handling. The difference is only that it simplified by taking maximal address value instead of assign an unused area.

__libc_csu_init is in libc.a(elf-init.o) and has no debug info. It should symbolize to ?? but not_used.cpp is incorrectly returned. ?? -> not_used.cpp

A more serious problem is when a main.cpp symbol incorrectly resolves to not_used.cpp. I created D60470 to fix that.

avl added a comment.Tue, Apr 9, 12:31 PM

__libc_csu_init is in libc.a(elf-init.o) and has no debug info. It should symbolize to ?? but not_used.cpp is incorrectly returned. ?? -> not_used.cpp

I did not see that problem if this fix applied. Everything looks exactly like you expect:

nm res.out
0000000000201100 T __libc_csu_init

U __libc_start_main

00000000002010f0 T main

llvm-symbolizer -obj=res.o 0x0000000000201100
__libc_csu_init
??:0:0 <<<<<<<<<<<<<<<<<<<<<<<<<<<

llvm-symbolizer -obj=res.o 0x00000000002010f0
main
main.cpp:2:4 <<<<<<<<<<<<<<<<<<<<<<<<<<<

Did I correctly understand the problem which you are talking about?

A more serious problem is when a main.cpp symbol incorrectly resolves to not_used.cpp. I created D60470 to fix that.

Would solution from D60470 work correctly for platform where 0 is a valid virtual address ?

MaskRay resigned from this revision.Tue, Apr 9, 8:06 PM
In D59553#1460304, @avl wrote:

__libc_csu_init is in libc.a(elf-init.o) and has no debug info. It should symbolize to ?? but not_used.cpp is incorrectly returned. ?? -> not_used.cpp

I did not see that problem if this fix applied. Everything looks exactly like you expect:

nm res.out
0000000000201100 T __libc_csu_init

U __libc_start_main

00000000002010f0 T main

llvm-symbolizer -obj=res.o 0x0000000000201100
__libc_csu_init
??:0:0 <<<<<<<<<<<<<<<<<<<<<<<<<<<

llvm-symbolizer -obj=res.o 0x00000000002010f0
main
main.cpp:2:4 <<<<<<<<<<<<<<<<<<<<<<<<<<<

Did I correctly understand the problem which you are talking about?

A more serious problem is when a main.cpp symbol incorrectly resolves to not_used.cpp. I created D60470 to fix that.

Would solution from D60470 work correctly for platform where 0 is a valid virtual address ?

See my comment at https://reviews.llvm.org/D60470#1460670.

MaskRay added inline comments.Tue, Apr 9, 8:10 PM
lld/ELF/InputSection.cpp
840

I think -1 also works, no need to use -2. -1 is used somewhere in llvm/lib/DebugInfo/DWARF/ to indicate "the value is absent", exactly what we want to express here.

ruiu added a comment.Tue, Apr 9, 11:39 PM
In D59553#1459554, @avl wrote:

I thought that address 0 is a good candidate to represent an invalid address, as nullptr is usually represented by address 0. I wonder why we want to use UINT64_MAX-2 instead of 0.

using 0 in this case could lead to following problems :

  1. on some platforms 0 is a valid address. i.e. image base starts from 0. In that case no way to differentiate valid address from invalid one just by checking for 0.

But this is also true to UINT64_MAX-2 because you can put your code or data there, no?

  1. if size of deleted section is long enough then debug range could clash with valid section address range. That leads to incorrect debug info selecting. As written in another message :

lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm->symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)

Using UINT64_MAX-1 prevents from above problems and allow to select correct DWARF entry.

in other words: using 0 as start of address range could lead to overlapping with valid address range.
using UINT64_MAX-1 as start of address range makes address range of deleted section be outside of module address space and would not clash with existing ranges.

avl added a comment.Wed, Apr 10, 3:12 AM
In D59553#1459554, @avl wrote:
  1. on some platforms 0 is a valid address. i.e. image base starts from 0. In that case no way to differentiate valid address from invalid one just by checking for 0.

But this is also true to UINT64_MAX-2 because you can put your code or data there, no?

yes. that`s true. But using UINT64_MAX-1 to assign addresses looks more like an artificial case. It is natural to start assign addresses from start of address space(which could be 0). It is much less common practice(AFAIK) to assign addresses in the very end of address space.

In the very first version of that path there was a code which analyzes sections for assigned addresses and select an address which does not clash with already assigned module`s ranges. It was suggested to simplify this processing to use UINT64_MAX-1 instead of analyzing address ranges. Assuming that it is really rare case when UINT64_MAX-1 would clash with existing module`s ranges. So using UINT64_MAX-1 is "simple, fast, good enough" solution which works better than 0. i.e. there would be much less cases when addresses would clash for UINT64_MAX-1 than for 0.

avl marked an inline comment as done.Wed, Apr 10, 3:20 AM
avl added inline comments.
lld/ELF/InputSection.cpp
840

-1 used as a special value indicating "base address selection entry" in debug range list processing. Since we do not need to indicate "base address selection entry" here then it would probably be better to use -2.

avl added a comment.Fri, Apr 12, 12:46 AM

@MaskRay Hi, Since you changed status of review into "Needs Revision" - What changes do you think should be done ?

avl requested review of this revision.Mon, Apr 15, 1:56 PM
MaskRay added a comment.EditedMon, Apr 15, 10:23 PM
In D59553#1463805, @avl wrote:

@MaskRay Hi, Since you changed status of review into "Needs Revision" - What changes do you think should be done ?

My thoughts about this change are split. I wanted to reset the status of review into "Normal" but I didn't how to do that :( I didn't want to change it to "Accepted" so I chose "Resign". It seems it didn't restore the status back to normal and only you can do that...

What I don't like

  • complexity. Two lines are two lines.
  • The demonstration case is a bit contrived. Filling in a garbage collected text section with zeros/nops to overlap with another text section doesn't seem common. The problems aren't new but lld's layout mitigates many(most?) of the problems. See https://reviews.llvm.org/D60470#1462246
  • Existing tools have learned various heuristics to avoid 0 DW_AT_low_pc collision, but I am not sure what they'd do if those relocations resolve to -1. llvm-symbolizer is certainly good (I have checked a few examples - I'm learning it recently), but what about other tools?

What I like

  • -1 or -2 makes the resolved relocation outstanding -> a different value from 0 makes it more obvious. Avoiding -1 just because "base address selection entry" uses -1 is not a strong argument IMHO. As long as there ranges are invalid (or valid but unrealistic to collide with good ones) we are good.
MaskRay added a reviewer: MaskRay.
avl added a comment.Mon, Apr 15, 10:49 PM

@ruiu What do you think about using UINT64_MAX-1 as address to which relocations for deleted sections would be resolved ? Does it make sense ?

avl added a comment.Mon, Apr 15, 11:25 PM

Existing tools have learned various heuristics to avoid 0 DW_AT_low_pc collision, but I am not sure what they'd do if those relocations resolve to -1. llvm-symbolizer is certainly good (I have checked a few >examples - I'm learning it recently), but what about other tools?

addr2line started to behave correctly :

[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
addr2line -e res.out 0x400497
/home/avl/bugs/gc_debuginfo/not_used.cpp:2 <<<<<<<<<<<<<<<<<<<<<<<<<< incorrect if built by ld

[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
addr2line -e res.out 0x2010e7
/home/avl/bugs/gc_debuginfo/main.cpp:3 <<<<<<<<<<<<<<<<<<<<<<< correct if built by lld with this patch

avl added a comment.Mon, Apr 22, 7:21 AM

I find a previous discussion at https://bugs.llvm.org/show_bug.cgi?id=37212 and https://reviews.llvm.org/D46502

this is another case(other than -ffunction-sections) where using UINT64_MAX-1 as address value for debug info related to discarded sections would help to dump debug info more correct.