Page MenuHomePhabricator

[WIP][LLD][ELF][DebugInfo] Use predefined value to mark debug address ranges pointing to deleted code.
AbandonedPublic

Authored by avl on Mar 19 2019, 10:34 AM.

Details

Summary
There are several problems connected to debug address ranges pointing
to code deleted by --gc-sections:

https://bugs.llvm.org/show_bug.cgi?id=41124

When -Wl,--gc-sections is used: unreferenced .text sections could be removed,
but debug info related to these deleted sections left unchanged.
Address ranges for this not needed debug info could overlap with
address ranges for correct code. In such case symbolizer shows
line information incorrectly.

http://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html

When -Wl,--gc-sections is used: unreferenced .text sections could be removed,
but debug info related to these deleted sections left unchanged.
Relocations related to debug address ranges are resolved to 0 in that case.
That could lead to unexpected end of range list inside .debug_ranges.
(for zero length functions).

That fix uses special predefined value to mark such address ranges.
It resolves relocations related to debug info for deleted code
to special vale UINT64_MAX-1(for .debug_ranges and .debug_loc)
and UINT64_MAX(for other .debug* tables). So that DWARF consumers
could recognize these invalid address ranges and handle them properly.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
avl added a comment.May 17 2019, 9:24 AM

@MaskRay

actually other DWARF consumers are not happy.

Example please.

please check following behavior of lldb, llvm-symbolizer, gnu addr2line, gnu objdump :

$ cat main.cpp
void foo_used();

int main(void) {

foo_used();
return 0;

}
$ cat funcs.cpp
void foo_not_used () {

__asm__(".rept 2105344; nop; .endr");

}

void foo_used () {

__asm__(".rept 10000; nop; .endr");

}

$ clang++ -gdwarf-4 -O funcs.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O main.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq %rax
res.out[0x203811] <+1>: callq 0x2010f0 ; foo_not_used + 240
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl %eax, %eax
res.out[0x203818] <+8>: popq %rcx
res.out[0x203819] <+9>: retq

(lldb) b foo_used
Breakpoint 1: where = res.out`foo_not_used() + 240, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:2:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ objdump -d -S res.out
0000000000201000 <_start>:
void foo_not_used () {

__asm__(".rept 2105344; nop; .endr");

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

201000:       31 ed                   xor    %ebp,%ebp
201002:       49 89 d1                mov    %rdx,%r9
201005:       5e                      pop    %rsi

And this patch makes them happy:

$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq %rax
res.out[0x203811] <+1>: callq 0x2010f0 ; foo_used at funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl %eax, %eax
res.out[0x203818] <+8>: popq %rcx
res.out[0x203819] <+9>: retq

(lldb) b foo_used
Breakpoint 1: where = res.out`foo_used() at funcs.cpp:6:5, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ objdump -d -S res.out
0000000000201000 <_start>:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

201000:       31 ed                   xor    %ebp,%ebp
201002:       49 89 d1                mov    %rdx,%r9
201005:       5e                      pop    %rsi
201006:       48 89 e2                mov    %rsp,%rdx
201009:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp

I checked behavior of addr2line and gdb -they are working correctly with this patch.

Can you check if gdb works without this patch?

gdb works correctly without this patch for cases when zero virtual address is not a correct value for code.
Because it uses 0 as special value indicating that address range is incorrect.
It would not work correctly for the platform where 0 is a valid vma for code.
It is generally wrong to use 0 as a special value.

Using 0 as the beginning of the section with code is not forbidden.

I want to see a concrete example where 0 is used as a valid DW_AT_low_pc.

  1. according to @jhenderson Sony has/would have in future such use case.
  2. according to this http://lists.llvm.org/pipermail/lldb-dev/2017-March/012091.html Qualcomm Kalimba DSP and the XAP RISC CPU have this use case. they even use similar to this patch approach to solve the problem.
  3. Such use case often used for embedded systems.

I personally do not have access to such systems to demonstrate a use case for you.

lld generated executable would have a problem. ld and gold generated would not.

You may check how the R and RW PT_LOAD segments are laid out in ld.bfd and gold linked modules.

For the above example :

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=ld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000004004a0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6

$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=gold -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x0000000000400510
/home/avl/bugs/gc_debuginfo/funcs.cpp:6

lld generated binary has problem, ld and gold do not have.

avl added a comment.May 17 2019, 9:28 AM

@jhenderson

I do agree that this patching should be limited to debug sections though.
Other consumer sections might have other rules regarding special values etc, and be working on the assumption of patching to 0 is the best plan.

OK, I would change a patch accordingly and update shortly.

MaskRay added a subscriber: labath.May 18 2019, 8:43 PM

I don't see a problem with changing lldb's behavior to add special treatment for address zero. It's not the most ideal solution (that would be to avoid generating the debug info in the first place), but given that we have to accept inputs from other linkers too (which also implement this behavior), it's probably the best thing we can do now.

avl added a comment.May 20 2019, 3:01 AM

@labath

In general it is not possible to handle zero case properly in lldb. It does not have information which address range is correct.
if platform allows section to contain code starting from zero then it is not possible to detect good and bad address range:

One piece of debug info references removed address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
Another piece of debug info references correct address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000

which address range should be selected ?

Solution from this patch prevents such situations. lld would not generate clashed correct address ranges at all.

Do you think it is bad to have a solution which allows to not to have clashed address ranges ?

lldb still is able to implement some heuristics for code generated by ld/gold.(which would also not work for zero-address platform)
But binaries generated by lld would be properly handled by all DWARF consumers without any change in them.
Because there would not be clashed addresses.

avl updated this revision to Diff 200318.May 20 2019, 9:47 AM

limit resolving relocations to UINT64_MAX-1 value for only debug sections, as requested.

I'm personally happy with the implementation now, but obviously there needs to be agreement from the other reviewers.

avl updated this revision to Diff 201495.May 27 2019, 4:05 AM

deleted redundant tests.

avl added a comment.Jun 2 2019, 11:58 PM

ping. @ruiu Do you think it would be useful to integrate this patch ?

this patch already has Looks Good from @jhenderson and @grimar .

It helps lldb, llvm-symbolizer, gnu addr2line, gnu objdump to show debug info properly in case clashed address ranges.

ruiu added a comment.Jun 3 2019, 12:01 AM

I'm fine with this patch. We've discussed a lot about what is a desirable value for a "null" for debug info, but looks like as long as it does not appear as a valid address, it should be fine.

Is Eric fine with this?

avl added a comment.Jun 4 2019, 7:28 AM

@echristo What is your opinion on this patch ? Is it worth to integrate it ?

avl added a comment.Jun 14 2019, 2:09 PM

@echristo Hi Eric, Could check this revision, please ? Is that patch Ok for you ?

In D59553#1557151, @avl wrote:

@avl I suspect we probably should implement the PRETEND/CB_PRETEND logic in lld instead, to properly fix https://reviews.llvm.org/D63772
I'll need to do some experiments. If we have PRETEND, we will not need the -2 hack in relocateNonAlloc

avl added a comment.Jun 26 2019, 3:15 AM

@MaskRay Hi MaskRay, PRETEND is a connected but separate issue. I agree it would be good to implement some kind of the PRETEND/CB_PRETEND logic in lld. But it does not make obsolete this patch. implementing of PRETEND logic answers the question "how to select proper debug info for COMDAT section". That patch answers the question "How to avoid overlapping address ranges in case debug info referenced to deleted code presented". This patch makes sense not only for PRETEND. Thus, I do not agree with "If we have PRETEND, we will not need the -2 hack in relocateNonAlloc". Let`s not mixup issues.

@avl, something you might want to watch out for is the behaviour of llvm-dwarfdump --verify where this patching has happened. I've seen it spit out complaints about "invalid address ranges" for this situation.

avl added a comment.Jul 4 2019, 10:42 AM

@jhenderson Thank you for pointing this out! Actually --verify complains about errors in both cases(with the fix and without the fix) for the testcase from this review:

without this patch :

error: DIE has overlapping address ranges: [0x0000000000000000, 0x0000000000000006) and [0x0000000000000000, 0x0000000000000006)
error: DIEs have overlapping address ranges:
error: DIE has overlapping address ranges: [0x0000000000000000, 0x0000000000000006) and [0x0000000000000000, 0x0000000000000006)
error: DIEs have overlapping address ranges:

with this patch :

error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

there is not any address range to which these debug_info sections relate. This address range [0x0000000000000000, 0x0000000000000006) should probably also be reported by --verify since it less than image_base...

Probably, without eliminating broken debug_info there is no way to verify it without errors?

MaskRay added a comment.EditedJul 4 2019, 8:00 PM

PRETEND/CB_PRETEND

I thought CB_PRETEND could fix a "error on relocations to discarded SHF_ALLOC sections" issue but I fixed it in another way. Let's ignore my comment about CB_PRETEND.

error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

DW_AT_low_pc DW_AT_high_pc pairs like these are less ideal. Before, tools have learned 0 address is special. They may or may not need a special case to handle this. Now, -2 is introduced (whatever value we choose here, GNU linkers still either relocate these to 0 or use their CB_PRETEND logic). They need to learn one more rule. I still believe, teaching the symbolizers (addr2line, llvm-symbolizer D60470) is the way forward.

avl added a comment.Jul 4 2019, 11:43 PM

error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

DW_AT_low_pc DW_AT_high_pc pairs like these are less ideal. Before, tools have learned 0 address is special. They may or may not need a special case to handle this. Now, -2 is introduced (whatever value we choose here, GNU linkers still either relocate these to 0 or use their CB_PRETEND logic). They need to learn one more rule. I still believe, teaching the symbolizers (addr2line, llvm-symbolizer D60470) is the way forward.

the point is in that tools should NOT learn one more rule. There should not be done any additional logic for UINT64_MAX-1 in tools. the fact that llvm-dwarfdump --verify displays error message is probably normal. Because debug info references not existed address range. llvm-dwarfdump --verify dislays errors for current solution also. The proper way to fix llvm-dwarfdump --verify is to remove broken debug info.

In D59553#1571050, @avl wrote:

error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

DW_AT_low_pc DW_AT_high_pc pairs like these are less ideal. Before, tools have learned 0 address is special. They may or may not need a special case to handle this. Now, -2 is introduced (whatever value we choose here, GNU linkers still either relocate these to 0 or use their CB_PRETEND logic). They need to learn one more rule. I still believe, teaching the symbolizers (addr2line, llvm-symbolizer D60470) is the way forward.

the point is in that tools should NOT learn one more rule. There should not be done any additional logic for UINT64_MAX-1 in tools. the fact that llvm-dwarfdump --verify displays error message is probably normal. Because debug info references not existed address range. llvm-dwarfdump --verify dislays errors for current solution also. The proper way to fix llvm-dwarfdump --verify is to remove broken debug info.

I agree. As things stand, some tools have special case behaviour for 0, but many do not, and consequently get it wrong. Furthermore, there are some cases where it is impossible to determine whether 0 is special or not. For example for PIE output, 0 can be a valid address (not all linker scripts place the headers in memory), meaning the tools can't distinguish between a function at address 0, or a dead piece of debug information. The logic for -2 is simpler, because it is almost never the case that something will appear in memory at that location, so any logic required is much simpler. In those cases where there already is special handling, it is not much harder for -2 to be added as special and will work FAR more often than 0 will, whilst it keeps logic in tools that don't have 0 special-case handling simpler.

lld/test/ELF/gc-sections-debuginfo.s
41 ↗(On Diff #201495)

Nit: looks like there's a double blank line at EOF here?

In D59553#1571050, @avl wrote:

error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

DW_AT_low_pc DW_AT_high_pc pairs like these are less ideal. Before, tools have learned 0 address is special. They may or may not need a special case to handle this. Now, -2 is introduced (whatever value we choose here, GNU linkers still either relocate these to 0 or use their CB_PRETEND logic). They need to learn one more rule. I still believe, teaching the symbolizers (addr2line, llvm-symbolizer D60470) is the way forward.

the point is in that tools should NOT learn one more rule. There should not be done any additional logic for UINT64_MAX-1 in tools. the fact that llvm-dwarfdump --verify displays error message is probably normal. Because debug info references not existed address range. llvm-dwarfdump --verify dislays errors for current solution also. The proper way to fix llvm-dwarfdump --verify is to remove broken debug info.

I agree. As things stand, some tools have special case behaviour for 0, but many do not, and consequently get it wrong. Furthermore, there are some cases where it is impossible to determine whether 0 is special or not. For example for PIE output, 0 can be a valid address (not all linker scripts place the headers in memory), meaning the tools can't distinguish between a function at address 0, or a dead piece of debug information.

I disagree. For PIE output, 0 (__ehdr_start) points to the ELF header. It cannot be a valid code address. Note, the discussion is really about code addresses (DW_AT_low_pc, DW_AT_ranges, DW_AT_entry_pc, etc), not an unconstrained relocated file offset.

I disagree. For PIE output, 0 (__ehdr_start) points to the ELF header. It cannot be a valid code address. Note, the discussion is really about code addresses (DW_AT_low_pc, DW_AT_ranges, DW_AT_entry_pc, etc), not an unconstrained relocated file offset.

Not if the user is using linker scripts:

/* test.script */
PHDRS
{
    ph_load PT_LOAD FLAGS(0x1|0x2|0x4);
}

SECTIONS
{
    .text : {*(.text)} : ph_load
    .dynsym : { *(.dynsym) }
}

bar.c:

int bar(int x)
{
    x += 10000;
    return x ? x/12345 : 10;
}

void _start()
{
    bar(1234);
    bar(4321);
}

int foo(int x, int y)
{
    x += 10000 + y;
    y += x - 123456;

    bar(1234);
    bar(4321);


    foo(1234, 4321);
    foo(4321, 1234);

    return y ? x/12345 : 10;
}

Command-line:

clang --target=x86_64-pc-linux bar.c -c -g -ffunction-sections
ld.lld -pie bar.o -T test.script -o bar.elf --gc-sections

This results in the symbol bar being at address 0. The ELF header appears outside loadable memory. llvm-dwarfdump shows that 0 has been patched in the .debug_ranges field for both bar and foo. Running llvm-symbolizer and GNU addr2line produces mixed results:

$ llvm-symbolizer 0 0x40 0x50 0x80 0x100 -e bar.elf
bar
C:\Work\TempWork\bar.c:2:0

bar
C:\Work\TempWork\bar.c:4:5

_start
C:\Work\TempWork\bar.c:8:0

??
C:\Work\TempWork\bar.c:25:12

??
??:0:0

$ addr2line 0 0x40 0x50 0x80 0x100 -e bar.elf -f
bar
C:\Work\TempWork/bar.c:14
bar
C:\Work\TempWork/bar.c:19
_start
C:\Work\TempWork/bar.c:22
foo
C:\Work\TempWork/bar.c:25
??
??:0

Issues I noticed, some of which are possibly bugs, but it's hard to tell:

  • llvm-symbolizer reports the line number corresponding to the removed foo for the fourth case.
  • GNU addr2line line numbers are wrong in the first three cases, and foo is reported in the fourth.
  • GNU addr2line thinks foo is the right symbol for the fourth case, even though it has been removed. llvm-symbolizer just prints '??' (that's probably correct).

I haven't checked to see how many of these issues are fixed by D60470.

avl updated this revision to Diff 208160.Jul 5 2019, 5:36 AM

removed whitespaces, added handling x86 ILP32 case.

Issues I noticed, some of which are possibly bugs, but it's hard to tell:

  • llvm-symbolizer reports the line number corresponding to the removed foo for the fourth case.
  • GNU addr2line line numbers are wrong in the first three cases, and foo is reported in the fourth.
  • GNU addr2line thinks foo is the right symbol for the fourth case, even though it has been removed. llvm-symbolizer just prints '??' (that's probably correct).

I haven't checked to see how many of these issues are fixed by D60470.

With D60470, you still see the following for the fourth case:

??
C:\Work\TempWork\bar.c:25:12

It doesn't intend to fix this problem. What it tries to address, is if such discarded range overlaps with a valid normal range, the valid normal range will take priority. That change may also benefit ld.bfd and gold linked modules if their CB_PRETEND logic fails to apply. Because it also simplifies the code, I think it is still worth applying no matter the decision on this lld patch.

/* test.script */
PHDRS
{
    ph_load PT_LOAD FLAGS(0x1|0x2|0x4);
}

SECTIONS
{
    .text : {*(.text)} : ph_load
    .dynsym : { *(.dynsym) }
}

I agree in this case .text can have 0 link-time address.

I guess you probably don't need .dynsym. If this is a dynamically linked executable, it will have .interp, which is usually in the first page. Then .text will not have 0 sh_addr, and 0 will be an invalid code address. It is not common to have .dynstr after .text - it is before .text in BFD's default linker script.

For similar reasons, there should not be any .gnu.hash .dynstr .rodata .eh_frame etc. They are placed before .text by default in lld.

If you have .note* sections, you'll probably also place them in the first page. This ensures they are dumped by FreeBSD and Linux kernels when the core file size is limited.

In summary, to make 0 a valid code address in an lld linked module, lots of assumptions have to be made. These assumptions make the lld patch seem pretty much a hack for a very specialized use case.

avl added a comment.Jul 8 2019, 8:06 AM

I agree in this case .text can have 0 link-time address.

I guess you probably don't need .dynsym. If this is a dynamically linked executable, it will have .interp, which is usually in the first page. Then .text will not have 0 sh_addr, and 0 will be an invalid code address. It is not common to have .dynstr after .text - it is before .text in BFD's default linker script.

For similar reasons, there should not be any .gnu.hash .dynstr .rodata .eh_frame etc. They are placed before .text by default in lld.

If you have .note* sections, you'll probably also place them in the first page. This ensures they are dumped by FreeBSD and Linux kernels when the core file size is limited.

In summary, to make 0 a valid code address in an lld linked module, lots of assumptions have to be made. These assumptions make the lld patch seem pretty much a hack for a very specialized use case.

Having 0 as a valid code address is quite common for embedded world.
Please check that article, it describes exactly such a case https://www.embeddedrelated.com/showarticle/900.php
The linker script for that example has

MEMORY
{
  flash     (rx)   : ORIGIN = 0, LENGTH = 150K
  ram       (rw!x) : ORIGIN = 0x800000, LENGTH = 32K
}
SECTIONS
{
  .text :
  {
    ./crt0.o(.text*)
    *(.text*)
    *(.strings)
    *(._pm*)
    *(.init)
    *(.fini)
     _etext = . ;
    . = ALIGN(4);
  }  > flash

Other than that there were another mentions of " 0 as a valid code address" in this thread. Sony has such a case. Message from lldb-dev mentioned that: http://lists.llvm.org/pipermail/lldb-dev/2017-March/012091.html . Having all of that, "0 as a valid code address" does not look like very specialized use case.
It would be useful to support it, especially for embedded applications.

avl added a comment.Jul 11 2019, 2:16 PM

I'm fine with this patch. We've discussed a lot about what is a desirable value for a "null" for debug info, but looks like as long as it does not appear as a valid address, it should be fine.

Is Eric fine with this?

@ruiu , It looks like @echristo does not follow this review. Would you mind to approve it, please ?

I am following, I've just been out for 3 weeks. I'm catching up now and this is in my queue.

avl added a comment.Jul 11 2019, 2:28 PM

Thank you, I apologize for impatience.

avl added a comment.Aug 13 2019, 8:33 AM

@echristo @aprantl @probinson

Folks, I would like to ask a question related to this review and DWARF standard:

DWARF5 standard has the following statement related to non-continuous address ranges:

2.17.3 Non-Contiguous Address Ranges
...
A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored.

Would not it be useful to extend it to explicitly explain the following case
(for some next version of the standard)

"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored. "

?

This situation (LowPC > HighPC) already represents an invalid range.

llvm-dwarfdump --verify :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

Thus it would be useful to indicate that explicitly and describe that default handling would be to skip.
So that broken ranges do not affect valid ranges.

For this patch, it would mean that practically all cases
would lead in this situation, and DWARF readers would adequately skip them :

LowPC -> points to deleted section -> set to UINT64_MAX-1; 
HighPC -> points to deleted section -> set to UINT64_MAX-1; 

ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; 
HighPC -> points to correct section -> set to some value less than UINT64_MAX-1;

ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; 
Length has some value -> HighPC = UINT64_MAX-1 + value -> overflow -> less than LowPC

ignored.

What do you think?

avl updated this revision to Diff 267547.May 31 2020, 11:52 PM
avl edited the summary of this revision. (Show Details)

rebased. updated according to the discussion on llvm-dev.
It is currently being discussed, so it could be changed more.

avl retitled this revision from [LLD][ELF][DebugInfo] llvm-symbolizer shows incorrect source line info if --gc-sections used to [WIP][LLD][ELF][DebugInfo] Use predefined value to mark debug address ranges pointing to deleted code..May 31 2020, 11:54 PM
avl edited the summary of this revision. (Show Details)

(again, thanks always for your ongoing patience on this complicated/long-term issue)

The flag name for disabling this seems a little confusing to me - since there's no option to really "disable" this sort of marking, but different choices in what value is used. So maybe something like "--dead-reloc-zero-addend" or something (don't go changing the patch on my bad suggestion here, but perhaps @MaskRay & others on this thread might have better ideas for names based on linker flag name history). I wouldn't worry about debating/changing this urgently - just a thought to ponder while we continue with the rest of the discussion.

(again, thanks always for your ongoing patience on this complicated/long-term issue)

The flag name for disabling this seems a little confusing to me - since there's no option to really "disable" this sort of marking, but different choices in what value is used. So maybe something like "--dead-reloc-zero-addend" or something (don't go changing the patch on my bad suggestion here, but perhaps @MaskRay & others on this thread might have better ideas for names based on linker flag name history). I wouldn't worry about debating/changing this urgently - just a thought to ponder while we continue with the rest of the discussion.

Perhaps if we're adding the option, the option value should be configurable, so something like --dead-reloc-addend=<value> which specifies the value instead of just assuming zero. You could even go further e.g. --dead-reloc-addend=<section>=<value> (your syntax may vary) to allow configuring it per section. Thus, the default would be something like --dead-reloc-addend=.debug_ranges=0xfffffffffffffffe --dead-reloc-addend=.debug_info=0xffffffffffffffff and so on. This would allow users to configure things according to the needs of their individual system.

(again, thanks always for your ongoing patience on this complicated/long-term issue)

The flag name for disabling this seems a little confusing to me - since there's no option to really "disable" this sort of marking, but different choices in what value is used. So maybe something like "--dead-reloc-zero-addend" or something (don't go changing the patch on my bad suggestion here, but perhaps @MaskRay & others on this thread might have better ideas for names based on linker flag name history). I wouldn't worry about debating/changing this urgently - just a thought to ponder while we continue with the rest of the discussion.

Perhaps if we're adding the option, the option value should be configurable, so something like --dead-reloc-addend=<value> which specifies the value instead of just assuming zero. You could even go further e.g. --dead-reloc-addend=<section>=<value> (your syntax may vary) to allow configuring it per section. Thus, the default would be something like --dead-reloc-addend=.debug_ranges=0xfffffffffffffffe --dead-reloc-addend=.debug_info=0xffffffffffffffff and so on. This would allow users to configure things according to the needs of their individual system.

Doable if necessary (will you likely have a need for this in PS4? that'd make it easier to justify adding the feature) - but I'd try to avoid adding such flexibility for as long as possible to avoid proliferation of even more variants of debug info consumers might then have to deal with.

(again, thanks always for your ongoing patience on this complicated/long-term issue)

The flag name for disabling this seems a little confusing to me - since there's no option to really "disable" this sort of marking, but different choices in what value is used. So maybe something like "--dead-reloc-zero-addend" or something (don't go changing the patch on my bad suggestion here, but perhaps @MaskRay & others on this thread might have better ideas for names based on linker flag name history). I wouldn't worry about debating/changing this urgently - just a thought to ponder while we continue with the rest of the discussion.

Perhaps if we're adding the option, the option value should be configurable, so something like --dead-reloc-addend=<value> which specifies the value instead of just assuming zero. You could even go further e.g. --dead-reloc-addend=<section>=<value> (your syntax may vary) to allow configuring it per section. Thus, the default would be something like --dead-reloc-addend=.debug_ranges=0xfffffffffffffffe --dead-reloc-addend=.debug_info=0xffffffffffffffff and so on. This would allow users to configure things according to the needs of their individual system.

Doable if necessary (will you likely have a need for this in PS4? that'd make it easier to justify adding the feature) - but I'd try to avoid adding such flexibility for as long as possible to avoid proliferation of even more variants of debug info consumers might then have to deal with.

As far as I know, we don't have any need to change from the -2/-1 values, so no specific justification from that point of view. I'm mostly thinking of weird cases where users might have to use addresses in the space we're using for whatever reason (signed addresses/wrapped address space etc).

avl added a comment.Jun 8 2020, 5:20 AM

Do we have agreement on following option syntax;

--dead-reloc-addend=<value>

?

If yes - I would update the patch.

In D59553#2079451, @avl wrote:

Do we have agreement on following option syntax;

--dead-reloc-addend=<value>

?

If yes - I would update the patch.

I'd like to avoid adding a flag at all, if possible.

Is there already a clear case where a one-size-fits-all solution won't be sufficient?

avl added a comment.Jun 8 2020, 10:59 AM

I'd like to avoid adding a flag at all, if possible.

without the flag there is a risk that some tools expecting 0(current behavior) would work incorrectly with -1/-2.
I do not know whether that problem really exist. But, since we are changing current default behavior, that problem could arise.
Thus, the flag which could restore old behavior - looks useful.

Is there already a clear case where a one-size-fits-all solution won't be sufficient?

I did not clearly get it. is "one-size-fits-all" means option which affects all tables(without special cases for debug_loc/debug_ranges)?

In D59553#2080357, @avl wrote:

I'd like to avoid adding a flag at all, if possible.

without the flag there is a risk that some tools expecting 0(current behavior) would work incorrectly with -1/-2.
I do not know whether that problem really exist. But, since we are changing current default behavior, that problem could arise.
Thus, the flag which could restore old behavior - looks useful.

Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.

Is there already a clear case where a one-size-fits-all solution won't be sufficient?

I did not clearly get it. is "one-size-fits-all" means option which affects all tables(without special cases for debug_loc/debug_ranges)?

Sorry, by "one-size-fits-all" I just meant "without a flag to customize the behavior". ie: do we know of specific use cases where the proposed -1/-2 behavior is definitely not going to be usable (that would make the flag discussion easier/more clear)?

avl added a comment.Jun 8 2020, 11:37 AM

Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.

--dead-reloc-obsolete-addend

or

--dead-reloc-obsolete

? (yes/no option).

Sorry, by "one-size-fits-all" I just meant "without a flag to customize the behavior". ie: do we know of specific use cases where the proposed -1/-2 behavior is definitely not going to be usable (that would make the flag discussion easier/more clear)?

I see, now. Thank you. I do not know such cases.

In D59553#2080495, @avl wrote:

Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.

--dead-reloc-obsolete-addend

or

--dead-reloc-obsolete

? (yes/no option).

Not really sure - sometimes flags use the word "experimental" but that more sounds like it might become a real flag in the future, whereas the hope is that it's the other direction. The only other mechanism I've seen is the really verbose flag " -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang" - but I don't think that's really necessary here. Probably "experimental" (eg: --experimental-dead-reloc (I think 'experimental' usually goes at the start - but you could check other flags for inspiration)) is enough, if we are going to have a flag at all.

In D59553#1627199, @avl wrote:

@echristo @aprantl @probinson

Folks, I would like to ask a question related to this review and DWARF standard:

DWARF5 standard has the following statement related to non-continuous address ranges:

2.17.3 Non-Contiguous Address Ranges
...
A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored.

Would not it be useful to extend it to explicitly explain the following case
(for some next version of the standard)

"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored. "

?

This situation (LowPC > HighPC) already represents an invalid range.

llvm-dwarfdump --verify :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)

Thus it would be useful to indicate that explicitly and describe that default handling would be to skip.
So that broken ranges do not affect valid ranges.

For this patch, it would mean that practically all cases
would lead in this situation, and DWARF readers would adequately skip them :

LowPC -> points to deleted section -> set to UINT64_MAX-1; 
HighPC -> points to deleted section -> set to UINT64_MAX-1; 

ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; 
HighPC -> points to correct section -> set to some value less than UINT64_MAX-1;

ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; 
Length has some value -> HighPC = UINT64_MAX-1 + value -> overflow -> less than LowPC

ignored.

What do you think?

Answering this very old question:

Case 1 is a zero-length range and would always be interpreted that way.
Case 2 should not occur, I think; the compiler should not produce a range where low/high point to different sections.
Case 3 we are hoping will be detected and treated as an invalid range by consumers.

There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.

avl added a comment.Jun 8 2020, 2:04 PM

Case 2 should not occur, I think; the compiler should not produce a range where low/high point to different sections.

It could occur when discarded section followed with the valid section.
The range for discarded section is [LowPC,HighPC). HighPC points to the next byte after specified range.
Thus HighPC could be valid address - LowPC of next range.

Case 3 we are hoping will be detected and treated as an invalid range by consumers.

Probably, it makes sense to describe it directly in the DWARF standard?
The same how it is already spoken for the zero-length ranges:

"A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored."

For LowPC>HighPC it could sound similarly:

"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored."

Thus, above situation would be explicitly described and then should be processed accordingly by consumers.

There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.

yes. So we need more rules: Base address is being marked with -1/-2 according to last discussions.
So DWARF standard should explicitly describe that all ranges using base address of -1/-2 value become invalid.
Right?

In D59553#2080941, @avl wrote:

Case 2 should not occur, I think; the compiler should not produce a range where low/high point to different sections.

It could occur when discarded section followed with the valid section.
The range for discarded section is [LowPC,HighPC). HighPC points to the next byte after specified range.
Thus HighPC could be valid address - LowPC of next range.

It's /possible/ for a compiler to use such code to encode a low/high range, but seems quite unlikely that a compiler would do that in practice (compared to using an offset relative to the section the code is in). I think it'd be OK to ignore that case for now/suggest that if a compiler does such a thing it's up to that compiler to figure out how to get the tombstone value.

But I'm also OK saying that if either the low or high pc have the tombstone value, the range is invalid - so a produce could, in theory do what you're describing. But seems unlikely they ever would.

Case 3 we are hoping will be detected and treated as an invalid range by consumers.

Probably, it makes sense to describe it directly in the DWARF standard?
The same how it is already spoken for the zero-length ranges:

"A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored."

For LowPC>HighPC it could sound similarly:

"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored."

Thus, above situation would be explicitly described and then should be processed accordingly by consumers.

There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.

yes. So we need more rules: Base address is being marked with -1/-2 according to last discussions.
So DWARF standard should explicitly describe that all ranges using base address of -1/-2 value become invalid.
Right?

I think the thing for the DWARF spec to do would be to say a fundamental address of -1 (the DWARFv6 spec won't need to talk about -2, because that'll just be a compatibility issue for DWARFv4 and below - and DWARF doesn't issue fixes to old specs - so it'll be just an implementation detail between DWARF producers and consumers) is the tombstone/reserved value to describe dead code and that any arithmetic on that value is ignored (so you can't use ((address)-1) + 5 to reach a valid address -1 remains -1 or otherwise tracks invalid through any arithmetic).

In D59553#2080495, @avl wrote:

Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.

--dead-reloc-obsolete-addend

or

--dead-reloc-obsolete

? (yes/no option).

Not really sure - sometimes flags use the word "experimental" but that more sounds like it might become a real flag in the future, whereas the hope is that it's the other direction. The only other mechanism I've seen is the really verbose flag " -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang" - but I don't think that's really necessary here. Probably "experimental" (eg: --experimental-dead-reloc (I think 'experimental' usually goes at the start - but you could check other flags for inspiration)) is enough, if we are going to have a flag at all.

How about we just go with the basic --dead-reloc-addend (or similar) with no experimental or obsolete in the name? In that case, I'd make it hidden, so users can't find it in the normal help text, and add comments both in the help text (for --help-hidden, and comments in the code) and in the release notes saying it is intended as a chicken switch for compatibility reasons, and that they should contact the LLVM community if the switch is ever actually needed, but if we don't hear we will remove it in the future.

In D59553#2080495, @avl wrote:

Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.

--dead-reloc-obsolete-addend

or

--dead-reloc-obsolete

? (yes/no option).

Not really sure - sometimes flags use the word "experimental" but that more sounds like it might become a real flag in the future, whereas the hope is that it's the other direction. The only other mechanism I've seen is the really verbose flag " -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang" - but I don't think that's really necessary here. Probably "experimental" (eg: --experimental-dead-reloc (I think 'experimental' usually goes at the start - but you could check other flags for inspiration)) is enough, if we are going to have a flag at all.

How about we just go with the basic --dead-reloc-addend (or similar) with no experimental or obsolete in the name? In that case, I'd make it hidden, so users can't find it in the normal help text, and add comments both in the help text (for --help-hidden, and comments in the code) and in the release notes saying it is intended as a chicken switch for compatibility reasons, and that they should contact the LLVM community if the switch is ever actually needed, but if we don't hear we will remove it in the future.

Not sure how many people discover these things by reading the manual or release notes - once it's on a stackoverflow post all context other than the name is probably lost.

But I don't feel too strongly about this if everyone else doesn't.

I've filed a DWARF issue to reserve "-1" as the tombstone address value. I wrote it up such that consumers should ignore that address and any address derived from it (e.g., base+offset). I don't have the issue number for it yet, hopefully I'll remember to post it here when it's assigned.

Using -2 for .debug_loc/.debug_ranges would have to be a convention, but is entirely "legal" from a DWARF standards perspective.

I've filed a DWARF issue to reserve "-1" as the tombstone address value. I wrote it up such that consumers should ignore that address and any address derived from it (e.g., base+offset). I don't have the issue number for it yet, hopefully I'll remember to post it here when it's assigned.

Using -2 for .debug_loc/.debug_ranges would have to be a convention, but is entirely "legal" from a DWARF standards perspective.

Thanks so much for starting the discussion/process with the DWARF Committee! (I guess once it's got an assigned number you could link it here & we can go CC ourselves ,maybe?)

If we can avoid --dead-reloc-addend, avoid it.

If we have to add an option, I like --dead-reloc-addend=section_glob=signed_or_unsigned (http://lists.llvm.org/pipermail/llvm-dev/2020-June/142073.html )
"My mere complaint is that the relocation record is not dead, but rather its referenced symbol is dead. However, I can't think of a better name..."

One thought when I started the thread (with the same subject as the llvm-dev's) on binutils, gdb and elfutils-devel (https://sourceware.org/pipermail/binutils/2020-May/111357.html ) was to collect use cases where -1/-2 may potentially be troublesome. (I haven't seen any reply suggesting that -1/-2 may be problematic yet.)

avl added a comment.Jun 9 2020, 1:40 PM

It looks like most common opinion is to not to create an option. Will remove it and update the patch.

I've filed a DWARF issue to reserve "-1" as the tombstone address value. I wrote it up such that consumers should ignore that address and any address derived from it (e.g., base+offset). I don't have the issue number for it yet, hopefully I'll remember to post it here when it's assigned.

Using -2 for .debug_loc/.debug_ranges would have to be a convention, but is entirely "legal" from a DWARF standards perspective.

Thanks so much for starting the discussion/process with the DWARF Committee! (I guess once it's got an assigned number you could link it here & we can go CC ourselves ,maybe?)

Ah @MaskRay posted the link to the DWARF issue ( http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 ) in another thread. In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :

Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.

(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.

Ah @MaskRay posted the link to the DWARF issue ( http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 ) in another thread.

I'm not aware of that other thread.

In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :

Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.

(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

No dispute with any of the above. In some respects, yeah I glossed over some things or got details wrong, but the point is: There used to be more copies of a thing, now there are fewer, what is appropriate for DWARF to say about that? Allowing a more efficient way to say "not here boss" is what matters.

Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.

DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.

If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.

MaskRay added a comment.EditedJun 9 2020, 3:52 PM

Ah @MaskRay posted the link to the DWARF issue ( http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 ) in another thread.

I'm not aware of that other thread.

https://sourceware.org/pipermail/binutils/2020-June/111640.html I will share results here or our llvm-dev thread if there is useful feedback.

In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :

Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.

(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

No dispute with any of the above. In some respects, yeah I glossed over some things or got details wrong, but the point is: There used to be more copies of a thing, now there are fewer, what is appropriate for DWARF to say about that? Allowing a more efficient way to say "not here boss" is what matters.

// clang -g -c a.c
// ld.lld --icf=all a.o
void foo() {} // .text.foo
void bar() {} // .text.bar , folded by ICF

Resolving a relocation referencing .text.bar to the tombstone value will make .debug_* diverge more from other non-SHF_ALLOC sections. It is not difficult to implement, though:
if (Defined *ds = dyn_cast<Defined>(&sym)) ds->section->repl == ds->section => (not folded by ICF)

Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.

DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.

If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.

Ah @MaskRay posted the link to the DWARF issue ( http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 ) in another thread.

I'm not aware of that other thread.

In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :

Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.

(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

No dispute with any of the above. In some respects, yeah I glossed over some things or got details wrong, but the point is: There used to be more copies of a thing, now there are fewer, what is appropriate for DWARF to say about that? Allowing a more efficient way to say "not here boss" is what matters.

For sure!

Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.

DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.

If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.

Yeah, I doubt it's worth resubmitting or anything, totally your call on whether you think the finer points will be a distraction or not (you know the folks on the committee/kind of conversational dynamics/etc far better than I do) - probably something to keep in mind when it comes to finalizing the wording.

(summary-ish: comdat functions that are optimized differently: subprograms must not refer to the differently optimized version
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

Resolving a (relocation referencing a symbol defined in a discarded COMDAT group) to the tombstone value may be more troublesome.

clang -c -ffunction-sections a.cc  # no debug info
clang -c -ffunction-sections -g b.cc

Assuming a.cc & b.cc define the same function, should the linker resolve relocations in b.o(.debug_info) to the tombstone value? Maybe not because we'll lose all debugging information.
If both a.o & b.o have debugging information, it may be reasonable to resolve relocations in b.o(.debug_info) to the tombstone value.

(summary-ish: comdat functions that are optimized differently: subprograms must not refer to the differently optimized version
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)

Resolving a (relocation referencing a symbol defined in a discarded COMDAT group) to the tombstone value may be more troublesome.

clang -c -ffunction-sections a.cc  # no debug info
clang -c -ffunction-sections -g b.cc

Assuming a.cc & b.cc define the same function, should the linker resolve relocations in b.o(.debug_info) to the tombstone value? Maybe not because we'll lose all debugging information.
If both a.o & b.o have debugging information, it may be reasonable to resolve relocations in b.o(.debug_info) to the tombstone value.

if it comes to that, yeah, it would be nice-to-have if one debug info copy got the non-tombstone value, but I wouldn't stress about that - it's a pretty narrow case, and would only apply with a partial-debug build (we have a lot of debug info optimizations that are on-by-default (except for lldb) that assume the whole program is built with debug info).

avl updated this revision to Diff 269874.Jun 10 2020, 9:06 AM

removed backward compatibility option.

FWIW, I wonder whether the test case would be better written with YAML rather than assembly input. It would allow for more explicitly specification of what relocations are emitted, and would allow reusing the same yaml blob using the -D option for the differences between the 32-bit and 64-bit cases, relocation types etc.

lld/ELF/InputSection.cpp
845–846

returns a special value
indicate a reference from a debug

852–853

I might well have missed something, but why this special casing specifically for X86_64? Do other 64-bit targets not have the same issue (I'm assuming the masking is required to prevent some sort of "too big to fit" error from the relocation processing).

It also looks like some of this if statement is untested (specifically the bit about EM_X86_64).

873

I struggled to interpret this variable name without looking at how it is used. Perhaps noPrefixName might be better.

932–933

to resolve such relocation -> to resolve it

lld/test/ELF/gc-sections-debuginfo.s
3 ↗(On Diff #269874)

Is there a reason you are echoing this content, rather than having it written in the test file like most tests (I assume it's the need for two different blocks of assembly)? If there is, then the test file should end with .test, rather than .s since this file currently contains no assembly.

4 ↗(On Diff #269874)

Use different file names for each of the test cases, e.g. %t.x64 and %t.i386

18 ↗(On Diff #269874)

I'd move this whole comment to the top of the file to show what is being tested. I'd also change the debuginfo part of the test name to debug-data (i.e. gc-sections-debug-data.s).

Actually, this isn't specific to --gc-sections right? The same code applies for discarded COMDATs, if I'm not mistaken? So should that be tested too?

22 ↗(On Diff #269874)

to an predefined -> to a predefined

23 ↗(On Diff #269874)

That test -> This test

24 ↗(On Diff #269874)

would be -> are

25 ↗(On Diff #269874)

. And -> and

We should explicitly test .debug_ranges as well as .debug_loc, since they are both special cased.

26 ↗(On Diff #269874)

would be -> are

64–65 ↗(On Diff #269874)

Nit: there's still a double blank line at EOF here that should be deleted.

4 ↗(On Diff #191909)

I don't know what the common format in LLD tests is off the top of my head, but I find it easier if the '|' is on the following line:

Hehe, amusing self-note looking back at this comment - I nowadays prefer the pattern the other way around, namely end one line with | \ and start the next line with a couple of extra spaces of indentation. I'm happy whichever way you wish to do it.

avl marked 2 inline comments as done.Jun 11 2020, 7:03 AM

would address all comments and rewrite test case with yaml.

lld/ELF/InputSection.cpp
852–853

I might well have missed something, but why this special casing specifically for X86_64? Do other 64-bit targets not have the same issue (I'm assuming the masking is required to prevent some sort of "too big to fit" error from the relocation processing).

right. lld reported "too big to fit" error. X86_64::relocate() function has following check:

checkUInt(loc, val, 32, rel);

It leads to this error for ILP32 case:

relocation R_X86_64_32 out of range: 18446744073709551615 is not in [0, 4294967295]; consider recompiling with -fdebug-types-section to reduce size of debug sections.

I checked these targets and they seem do not have above problem:

aarch64
aarch64_32
arm64
arm64_32
riscv64

But , I missed sparcv9 target. It has similar problem.

It also looks like some of this if statement is untested (specifically the bit about EM_X86_64).

It assumed to be tested with following line from test file:

  1. RUN: | llvm-mc --filetype=obj -triple=x86_64-unknown-linux32 - -o %t
  2. RUN: ld.lld %t -o %t2 --gc-sections

specifying x86_64 inside triple should lead to config->emachine = EM_X86_64, as far as I understand.

lld/test/ELF/gc-sections-debuginfo.s
3 ↗(On Diff #269874)

Originally I wrote this test as two files with assembly. But there was request to put it`s content into "echo". Would rewrite it with yaml, as requested.

FWIW, I wonder whether the test case would be better written with YAML rather than assembly input. It would allow for more explicitly specification of what relocations are emitted, and would allow reusing the same yaml blob using the -D option for the differences between the 32-bit and 64-bit cases, relocation types etc.

I think assembly is more readable than YAML. The assembly directive .reloc 0, R_X86_64_64, sym can encode arbitrary relocations.

avl updated this revision to Diff 270499.Jun 12 2020, 12:54 PM

addressed comments.
(not used yaml, since there is a request to use asm.
added tests for various architectures, not added test for COMDAT
sections yet)

I know that you have posted the patch for one year and spent much time on it. It would seem that I deprived the patch if I created a new one. Apologies that I just created D81784 to hopefully capture all the previous discussions. A bit more items why I started a new one:

  • We don't need to special case .zdebug_* => actually we are thinking of dropping input .zdebug_* support (I am sorry that our internal toolchain still depends on it, so I can't remove it even if the community feels that dropping it is ok).
  • R_X86_64_32 should not be checked. We should use target->symbolicRel
  • We don't need *-gc-sections-debug-data-64.s for every target. The feature is sufficiently generic - x86_64 + x86 covers most cases. symbolRel logic has separately be checked in many tests.
  • gc-sections is not a generic name. /DISCARD/, non-prevailing section group, ICF can create similar scenarios. We should use a generic test case.
  • I want to explicitly check that addend is ignored.
  • I want to a more comprehensive summary and call out to what is fixed and what isn't (non-prevailing section group).
avl added a comment.Jun 19 2020, 4:51 AM

abandoned in favor D81784

avl abandoned this revision.Jun 19 2020, 4:51 AM