There are several problems connected to debug address ranges pointing to code deleted by --gc-sections: https://bugs.llvm.org/show_bug.cgi?id=41124 When -Wl,--gc-sections is used: unreferenced .text sections could be removed, but debug info related to these deleted sections left unchanged. Address ranges for this not needed debug info could overlap with address ranges for correct code. In such case symbolizer shows line information incorrectly. http://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html When -Wl,--gc-sections is used: unreferenced .text sections could be removed, but debug info related to these deleted sections left unchanged. Relocations related to debug address ranges are resolved to 0 in that case. That could lead to unexpected end of range list inside .debug_ranges. (for zero length functions). That fix uses special predefined value to mark such address ranges. It resolves relocations related to debug info for deleted code to special vale UINT64_MAX-1(for .debug_ranges and .debug_loc) and UINT64_MAX(for other .debug* tables). So that DWARF consumers could recognize these invalid address ranges and handle them properly.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
lld/test/ELF/gc-sections-debuginfo.s | ||
---|---|---|
34 ↗ | (On Diff #191842) | I think in the given case when we just have .long/.quad relocation in the inputs we perhaps do not need any additional testing. We actually care only about first 4/8 bytes of the section and their value. For clarity, we might add a check for the section Size field perhaps though. |
lld/test/ELF/gc-sections-debuginfo.s | ||
---|---|---|
34 ↗ | (On Diff #191842) | Testing of the Size field of the section should work for this purpose. |
What does GDB make of -1/-2???
Speaking of GDB it looks like it behaives exactly the same. Please check results for the example from bug 41124 :
[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
gdb res.out
(gdb) break main.cpp:1
Breakpoint 1 at 0x400497: file main.cpp, line 1.
(gdb) break not_used.cpp:1
Breakpoint 2 at 0x0: file not_used.cpp, line 1.
(gdb) list main
1 int main(void) {
2 return 0;
3 }
(gdb) list foo
Function "foo" not defined.
(gdb) list main.cpp:1
1 int main(void) {
2 return 0;
3 }
(gdb) list not_used.cpp:1
1 void foo () {
2 asm(".rept 6000000; nop; .endr");
3 }
[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
gdb res.out
(gdb) break main.cpp:1
Breakpoint 1 at 0x2010e7: file main.cpp, line 1.
(gdb) break not_used.cpp:1
Breakpoint 2 at 0xfffffffffffffffe: file not_used.cpp, line 1.
(gdb) list main
1 int main(void) {
2 return 0;
3 }
(gdb) list foo
Function "foo" not defined.
(gdb) list main.cpp:1
1 int main(void) {
2 return 0;
3 }
(gdb) list not_used.cpp:1
1 void foo () {
2 asm(".rept 6000000; nop; .endr");
3 }
Speaking of gnu addr2line it looks like it become to work more correctly :
[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
addr2line -e res.out 0x400497
/home/avl/bugs/gc_debuginfo/not_used.cpp:2 <<<<<<<<<<<<<<<<<<<<<<<<<< incorrect
[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
addr2line -e res.out 0x2010e7
/home/avl/bugs/gc_debuginfo/main.cpp:3
I have no more comments, this looks fine to me now.
A minor nit is inline.
lld/test/ELF/gc-sections-debuginfo.s | ||
---|---|---|
11 ↗ | (On Diff #191909) | Hint: you can use -check-prefixes=CHECK,CHECK32 |
I think at this point the "debug info cabal" is more in favor of fragmenting the DWARF emitted by the compiler, so that it can be gc'd in tandem with the related functions. That solution costs extra relocations and extra sections in the .o file, but uses all existing mechanisms in the linker to achieve the goal.
But that opinion is based on the chat here and in D54747, and without a real llvm-dev discussion behind it.
I think at this point the "debug info cabal" is more in favor of fragmenting the DWARF emitted by the compiler, so that it can be gc'd in tandem with the related functions.
AFAIU this decision does not make this patch obsolete. Until above solution implemented or if some concrete dwarf producer would not implement that solution at all - it would be useful to have better line reporting. This patch improves llvm-symbolizer and addr2line reporting right after it would be integrated.
It also does not prevent future implementation of better solution (including D54747). D54747 solves some sort of problems but not all. There would still be cases with overlapping address ranges because of not removed debug info.
I think it makes sense to integrate that patch and keep it when(if) above "fragmenting the DWARF " solution would be done.
A resolution on the compiler/assembler side of things (basically i.e. not rewriting DWARF) will still need something in the linker for a while, due to old objects, if we want to have a perfect experience.
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | This will patch any non-alloc section that is relocated, not just .debug* sections. I think it should be scoped to those kinds of sections only, as it could break the behaviour for other unknown sections that people may have. | |
931 | If relocation -> If the relocation You don't need to put "-Wl," in the switch, since we're in the linker here already. Just use --gc-sections directly. | |
932 | to make start offset to be out of -> to make the start offset be outside the | |
933 | to not to overlap with -> to not overlap with the | |
934 | This bit doesn't quite make sense. Perhaps "We use UINT64_MAX-1 because..." | |
lld/test/ELF/gc-sections-debuginfo.s | ||
1 ↗ | (On Diff #191909) | General nit throughout this test: you are using a mixture of '-' and '--' for long switches, which is a bit jarring. Please switch to using '--' throughout, if possible. This test should test other debug section patching too, as the problem is not specific to .debug_info. The ones I'm thinking of in particular are .debug_types, .debug_ranges, .debug_line and .debug_aranges. Since the behaviour is actually patching of any debug section, perhaps it would be better to just call the section .debug_foo or similar? |
4 ↗ | (On Diff #191909) | I don't know what the common format in LLD tests is off the top of my head, but I find it easier if the '|' is on the following line: # RUN: echo .... \ # RUN: | llvm-mc ... |
6 ↗ | (On Diff #191909) | Nit: spurious extra space between -sections and -section-data |
11 ↗ | (On Diff #191909) | Also applies above. |
14 ↗ | (On Diff #191909) | Replace: |
15 ↗ | (On Diff #191909) | sections. But their... -> sections, but their Use "debug data" instead of "debug_info" throughout here, because .debug_info is just one section in the debug data. |
17 ↗ | (On Diff #191909) | address'es -> addresses |
18 ↗ | (On Diff #191909) | out of module -> out of the module |
19 ↗ | (On Diff #191909) | relocation from -> relocations from the |
20 ↗ | (On Diff #191909) | to deleted section .text -> to a deleted .text section |
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | Hmmm... I do not agree here. I think if the user's section assumes that linker resolves the relocation to a removed section as a zero, that does not seem the correct assumption to me. I doubt we should care that we switched the behavior here (or at least we probably want to know about such cases before explicitly supporting them, I think). |
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | In some previous version of the patch there was a check for Name.startswith(".debug") which I was asked to remove because "checking the symbol name is very hacky". Would it be OK to return it here ? Or probably solution similar to https://reviews.llvm.org/D54747 would be OK ? (create a flag Debug, set it Debug = Name.startswith(".debug"), and check it here) |
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | I don't feel strongly about it, so if there's opposition, what is here is probably fine, I think. | |
931 | Nit: the -> a | |
lld/test/ELF/gc-sections-debuginfo.s | ||
4 ↗ | (On Diff #192656) | Could you indent this like you were before, please? Same below. |
15 ↗ | (On Diff #192656) | info left -> data is left |
16 ↗ | (On Diff #192656) | overlapp -> overlap |
Apologies that I didn't check the information on the original bug. The GDB behaviour does look reasonable, thanks.
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | Checking the section name (not symbol name) is not *hacky* in this case. Section names that start with the prefix ".debug" are reserved "for the system". Checking that the section name starts with the prefix ".debug" is the correct ELF way to determine if it is a debug information section. |
lld/test/ELF/gc-sections-debuginfo.s | ||
---|---|---|
4 ↗ | (On Diff #192656) | I did not get it. What exactly should be done ? this :
should be change into this ?
|
lld/ELF/InputSection.cpp | ||
---|---|---|
930 | But testing the symbol name inside a hot relocation loop and changing the behavior depending on that is itself a bit hacky, isn't? | |
lld/test/ELF/gc-sections-debuginfo.s | ||
4 ↗ | (On Diff #192656) | I think James wanted you to add a double space (identation) before llvm-mc: # RUN: echo '.section .text,"a"; .byte 0; .section .debug_foo,"",@progbits; .quad .text' \ # RUN: | llvm-mc --filetype=obj --arch=x86-64 - -o %t |
lld/ELF/InputSection.cpp | ||
---|---|---|
931 | Sorry, one more comment here, now that I read it again. Change: COMDATs are removed even without --gc-sections, for example. | |
lld/test/ELF/gc-sections-debuginfo.s | ||
4 ↗ | (On Diff #192656) | Yes, exactly this. It makes it a little easier to read this as a continuation. |
Thanks. I have no more comments. I think this would be a good thing to do, but it definitely needs some wider agreement from @ruiu and the other LLD contributors.
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | Where in this document can I find a description why UINT64_MAX is special? Is using UINT64_MAX-1 as an "invalid value" just a local convention of lld (i.e. are you inventing a new notion) or a common practice? Is there any more common value representing an invalid address than UINT64_MAX-1? |
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | It is based on the comment by @jhenderson in start of this thread. He mentioned that -1 has issues in .debug_ranges. I would find the real description of using -1 for .debug_ranges and put here... |
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | It's fine if -1 has a special meaning, but it doesn't necessarily mean -2 is the second best. For example, why don't you use 0? What's special about -2? Or is it just arbitrary? |
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | We need to prevent address clashing for debug info referenced to valid sections and to deleted sections. To achieve that it is necessary to set an address which is out of module scope. So the idea is to set maximal allowed address value. UINT64_MAX is maximal value but used by .debug_aranges. UINT64_MAX -1 is the next maximal value. address range 0...0+LENGTH could clash with existing range. |
I thought that address 0 is a good candidate to represent an invalid address, as nullptr is usually represented by address 0. I wonder why we want to use UINT64_MAX-2 instead of 0.
lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm-symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)
If you make ".rept 2000000; nop; .endr" larger, you'll notice bfd linked executable has the same problem (its image base is 0x400000).
InputSection.cpp:833
Target->relocateOne(BufLoc, Type, SignExtend64<Bits>(Sym.getVA(Addend)));
I think the current handling in lld is quite good, no need to special case on if (!Sym.getOutputSection()). The same problem can be reproduced for ld.bfd and gold.
I think the right fix is to improve the heuristics used in llvm-symbolizer, rather than change lld. llvm-symbolizer may deal with executables linked by ld.bfd and gold.
One change we can make is that, in DWARFDebugAranges::extract, if LowPC is zero and the file is not a relocatable object, don't call appendRange(CUOffset, LowPC, HighPC);.
A workaround is to set a larger image base with --image-base=.
One change we can make is that, in DWARFDebugAranges::extract, if LowPC is zero and the file is not a relocatable object, don't call appendRange(CUOffset, LowPC, HighPC);.
I'm not sure if making zero DW_AT_low_pc special for executable/DSO in DWARFDebugAranges is a good idea.
I thought that address 0 is a good candidate to represent an invalid address, as nullptr is usually represented by address 0. I wonder why we want to use UINT64_MAX-2 instead of 0.
using 0 in this case could lead to following problems :
- on some platforms 0 is a valid address. i.e. image base starts from 0. In that case no way to differentiate valid address from invalid one just by checking for 0.
- if size of deleted section is long enough then debug range could clash with valid section address range. That leads to incorrect debug info selecting. As written in another message :
lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm->symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)
Using UINT64_MAX-1 prevents from above problems and allow to select correct DWARF entry.
in other words: using 0 as start of address range could lead to overlapping with valid address range.
using UINT64_MAX-1 as start of address range makes address range of deleted section be outside of module address space and would not clash with existing ranges.
I think the current handling in lld is quite good, no need to special case on if (!Sym.getOutputSection()). The same problem can be reproduced for ld.bfd and gold.
I think the right fix is to improve the heuristics used in llvm-symbolizer, rather than change lld. llvm-symbolizer may deal with executables linked by ld.bfd and gold.
But there is no way to heuristically detect this case which works generally. On some platforms image_base could start from 0. llvm-symbolizer just does not have enough information to decide whether address range correct or not. At the same time linker does have such information. So it looks correct to do that processing in linker. Linker knows that section deleted and able to set not matching address while resolving relocations pointed to this section.
Good general solution would be to delete not used debug info.
But for present time when that information exist - using UINT_MAX64-1 is a good way to resolve ambiquity. It is better than some heuristic which does not work for all cases.
I'm not sure if making zero DW_AT_low_pc special for executable/DSO in DWARFDebugAranges is a good idea.
That could be seen not as special value but as a value which does not clash with other ranges. For example Linker could(as an imaginary processing) assign address to the section and delete it after address assigned. In that case relocation could be resolved into this not overlapped area. Using UINT64_MAX-1 is similar to that handling. The difference is only that it simplified by taking maximal address value instead of assign an unused area.
__libc_csu_init is in libc.a(elf-init.o) and has no debug info. It should symbolize to ?? but not_used.cpp is incorrectly returned. ?? -> not_used.cpp
A more serious problem is when a main.cpp symbol incorrectly resolves to not_used.cpp. I created D60470 to fix that.
__libc_csu_init is in libc.a(elf-init.o) and has no debug info. It should symbolize to ?? but not_used.cpp is incorrectly returned. ?? -> not_used.cpp
I did not see that problem if this fix applied. Everything looks exactly like you expect:
nm res.out
0000000000201100 T __libc_csu_init
U __libc_start_main
00000000002010f0 T main
llvm-symbolizer -obj=res.o 0x0000000000201100
__libc_csu_init
??:0:0 <<<<<<<<<<<<<<<<<<<<<<<<<<<
llvm-symbolizer -obj=res.o 0x00000000002010f0
main
main.cpp:2:4 <<<<<<<<<<<<<<<<<<<<<<<<<<<
Did I correctly understand the problem which you are talking about?
A more serious problem is when a main.cpp symbol incorrectly resolves to not_used.cpp. I created D60470 to fix that.
Would solution from D60470 work correctly for platform where 0 is a valid virtual address ?
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | I think -1 also works, no need to use -2. -1 is used somewhere in llvm/lib/DebugInfo/DWARF/ to indicate "the value is absent", exactly what we want to express here. |
But this is also true to UINT64_MAX-2 because you can put your code or data there, no?
- if size of deleted section is long enough then debug range could clash with valid section address range. That leads to incorrect debug info selecting. As written in another message :
lld sets x86_64 image base to 0x200000.
A .debug_info associated with a non-discarded section has DW_AT_low_pc > 0x200000
A .debug_info associated with a discarded section has DW_AT_low_pc = 0. If its size is larger than 0x200000 => DW_AT_high_pc > image_base => the two debug info entries overlap => llvm->symbolizer locates a DIE that's not what the user expects => unexpected filename (not_used.cpp)Using UINT64_MAX-1 prevents from above problems and allow to select correct DWARF entry.
in other words: using 0 as start of address range could lead to overlapping with valid address range.
using UINT64_MAX-1 as start of address range makes address range of deleted section be outside of module address space and would not clash with existing ranges.
yes. that`s true. But using UINT64_MAX-1 to assign addresses looks more like an artificial case. It is natural to start assign addresses from start of address space(which could be 0). It is much less common practice(AFAIK) to assign addresses in the very end of address space.
In the very first version of that path there was a code which analyzes sections for assigned addresses and select an address which does not clash with already assigned module`s ranges. It was suggested to simplify this processing to use UINT64_MAX-1 instead of analyzing address ranges. Assuming that it is really rare case when UINT64_MAX-1 would clash with existing module`s ranges. So using UINT64_MAX-1 is "simple, fast, good enough" solution which works better than 0. i.e. there would be much less cases when addresses would clash for UINT64_MAX-1 than for 0.
lld/ELF/InputSection.cpp | ||
---|---|---|
936 | -1 used as a special value indicating "base address selection entry" in debug range list processing. Since we do not need to indicate "base address selection entry" here then it would probably be better to use -2. |
@MaskRay Hi, Since you changed status of review into "Needs Revision" - What changes do you think should be done ?
My thoughts about this change are split. I wanted to reset the status of review into "Normal" but I didn't how to do that :( I didn't want to change it to "Accepted" so I chose "Resign". It seems it didn't restore the status back to normal and only you can do that...
What I don't like
- complexity. Two lines are two lines.
- The demonstration case is a bit contrived. Filling in a garbage collected text section with zeros/nops to overlap with another text section doesn't seem common. The problems aren't new but lld's layout mitigates many(most?) of the problems. See https://reviews.llvm.org/D60470#1462246
- Existing tools have learned various heuristics to avoid 0 DW_AT_low_pc collision, but I am not sure what they'd do if those relocations resolve to -1. llvm-symbolizer is certainly good (I have checked a few examples - I'm learning it recently), but what about other tools?
What I like
- -1 or -2 makes the resolved relocation outstanding -> a different value from 0 makes it more obvious. Avoiding -1 just because "base address selection entry" uses -1 is not a strong argument IMHO. As long as there ranges are invalid (or valid but unrealistic to collide with good ones) we are good.
@ruiu What do you think about using UINT64_MAX-1 as address to which relocations for deleted sections would be resolved ? Does it make sense ?
Existing tools have learned various heuristics to avoid 0 DW_AT_low_pc collision, but I am not sure what they'd do if those relocations resolve to -1. llvm-symbolizer is certainly good (I have checked a few >examples - I'm learning it recently), but what about other tools?
addr2line started to behave correctly :
[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
addr2line -e res.out 0x400497
/home/avl/bugs/gc_debuginfo/not_used.cpp:2 <<<<<<<<<<<<<<<<<<<<<<<<<< incorrect if built by ld
[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
addr2line -e res.out 0x2010e7
/home/avl/bugs/gc_debuginfo/main.cpp:3 <<<<<<<<<<<<<<<<<<<<<<< correct if built by lld with this patch
I find a previous discussion at https://bugs.llvm.org/show_bug.cgi?id=37212 and https://reviews.llvm.org/D46502
I find a previous discussion at https://bugs.llvm.org/show_bug.cgi?id=37212 and https://reviews.llvm.org/D46502
this is another case(other than -ffunction-sections) where using UINT64_MAX-1 as address value for debug info related to discarded sections would help to dump debug info more correct.
Can I have your answers to my questions at https://reviews.llvm.org/D59553#1467876? I have said the example is contrived, and the zero DW_TAG_low_pc issue causes more problems to ld.bfd/gold linked modules than to lld linked modules.
Excuse me, I did not realize that those were questions. Please check my answers :
complexity. Two lines are two lines.
That is what I do not understand. Why two-lines patch considered to be complex?
The demonstration case is a bit contrived. Filling in a garbage collected text section with zeros/nops to overlap with another text section doesn't seem common. The problems aren't new but lld's layout mitigates many(most?) of the problems. See https://reviews.llvm.org/D60470#1462246
I do not think that the demonstration case is a contrived. The problem probably happened less frequently for common x86 case when image-base starts from big values. But there are other platforms/compilation languages/linking options/linkers where this problem happens more often. The test case shows the problem. I do not think it would be better if the test case would be bigger and more complex to show some realistic common x86 case. I also do not think that zero DW_TAG_low_pc issue causes more problems to ld.bfd/gold linked modules than to lld linked modules. There many combinations of platforms/compilation languages/linking options/linkers which could lead into the described problem. In various combinations zero DW_TAG_low_pc issue would be bigger problem for one linker and smaller problem for another(because of different linker behavior).
Existing tools have learned various heuristics to avoid 0 DW_AT_low_pc collision, but I am not sure what they'd do if those relocations resolve to -1. llvm-symbolizer is certainly good (I have checked a few examples - I'm learning it recently), but what about other tools?
addr2line started to behave correctly :
[LD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=ld not_used.o main.o -o res.out
addr2line -e res.out 0x400497
/home/avl/bugs/gc_debuginfo/not_used.cpp:2 <<<<<<<<<<<<<<<<<<<<<<<<<< incorrect if built by ld
[LLD]clang++ -gdwarf-5 -O -Wl,--gc-sections -fuse-ld=lld not_used.o main.o -o res.out
addr2line -e res.out 0x2010e7
/home/avl/bugs/gc_debuginfo/main.cpp:3 <<<<<<<<<<<<<<<<<<<<<<< correct if built by lld with this patch
I've some previous discussions PR37212 PR37891 etc. -2 or -1 may be benign to some non-allocated sections like .debug.*, but may not be good to others. The consumers may get used to treat 0 DW_AT_low_pc special but the patch may make some consumers code more special values. lld doesn't implement the CB_PRETEND logic in GNU linkers:
- gold CB_PRETEND https://sourceware.org/git/?p=binutils-gdb.git;a=blob;hb=9d6d4be89d12747a92629ed1bde1d423e2831de1;f=gold/target-reloc.h#l414
- ld.bfd action_discarded PRETEND https://sourceware.org/git/?p=binutils-gdb.git;a=blob;hb=9d6d4be89d12747a92629ed1bde1d423e2831de1;f=bfd/elflink.c#l10851
Maybe you can raise an issue on https://sourceware.org/bugzilla , asking if the relocation can be resolved to -2 if the PRETEND logic fails, or asking if addr2line should be fixed instead. I raised some issues in the weekend and the maintainers seem super responsive.
I think it is quite possible that we just need the llvm-symbolizer fix D60470 and an addr2line fix to make everything work out. Debuggers have had the heuristics for years.
But there are other platforms/compilation languages/linking options/linkers where this problem happens more often.
If you happen to know one combination, please elaborate :)
I also do not think that zero DW_TAG_low_pc issue causes more problems to ld.bfd/gold linked modules than to lld linked modules.
The problem probably happened less frequently for common x86 case when image-base starts from big values.
The image base doesn't matter (is zero) for -shared/-pie. I've shared my thoughts why this doesn't likely cause problems to lld. In lld, the PF_R PT_LOAD is placed before the PF_R|PF_X PT_LOAD (text segment). If you have a compilation unit that is sufficiently large (large DW_AT_high_pc, 0 DW_AT_low_pc), it is reasonable to have a sufficiently large PF_R PT_LOAD. Then the range will not collide with a good compilation unit with large DW_AT_low_pc.
On the GNU linkers' side, ld.bfd -z separate-code is a recent addition and it doesn't move the large .rodata etc before .text. gold hasn't implemented -z separate-code. They are more vulnerable to the issue.
I've some previous discussions PR37212 PR37891 etc. -2 or -1 may be benign to some non-allocated sections like .debug.*, but may not be good to others.
Could you explicitly describe that case, please ? i.e. When using UINT64_MAX-1 as start of address range for relocations pointing to deleted code would not be good for others ?
The consumers may get used to treat 0 DW_AT_low_pc special but the patch may make some consumers code more special values.
that would be a error. Treating 0 DW_AT_low_pc as a special value would not be correct because there is no such documented special value. Consumers should not rely that linker resolves the relocation to a removed section as a zero.
this patch does not add additional special value. consumer code would not be required to add more special values.
lld doesn't implement the CB_PRETEND logic in GNU linkers:
gold CB_PRETEND https://sourceware.org/git/?p=binutils-gdb.git;a=blob;hb=9d6d4be89d12747a92629ed1bde1d423e2831de1;f=gold/target-reloc.h#l414 ld.bfd action_discarded PRETEND https://sourceware.org/git/?p=binutils-gdb.git;a=blob;hb=9d6d4be89d12747a92629ed1bde1d423e2831de1;f=bfd/elflink.c#l10851Maybe you can raise an issue on https://sourceware.org/bugzilla , asking if the relocation can be resolved to -2 if the PRETEND logic fails, or asking if addr2line should be fixed instead. I raised some issues in the weekend and the >maintainers seem super responsive.
PRETEND case is a related but separate issue. PRETEND is about "assign address range of selected COMDAT section to the references to deleted COMDAT section".
This patch is about - "assign not-clashing address value to the references for the deleted code".
I can raise an issue there but I do not think this discussion should depend on https://sourceware.org/bugzilla.
I think it is quite possible that we just need the llvm-symbolizer fix D60470 and an addr2line fix to make everything work out. Debuggers have had the heuristics for years.
I do not agree with this.
llvm-symbolizer and addr2line just do not have enough information to properly select correct address range.
for the platform where image-base starts from 0 - D60470 will not help to select correct address range. Right ?
I think there are three levels of solutions for this problem :
- Best one: Delete wrong debug info. In this case the problem just would not arise.
- Good enough: Set not clashing address range for code pointing to deleted sections while resolving relocations in linker. linker knows which address range is correct. So it could unambiguously assign proper value.
- Less powerful: implement heuristics in llvm-symbolizer and addr2line. heuristics do not work for all cases. Thus wrong behavior would still exists.
But there are other platforms/compilation languages/linking options/linkers where this problem happens more often.
If you happen to know one combination, please elaborate :)
sure.
- x86 platform and -image-base=0 makes this problem to happen much more often.
- there are quite many platforms which already has -image-base starting from zero. This is often used for embedded systems. as an example - please check message from @jhenderson https://reviews.llvm.org/D60470#1462375 . In such case this problem happens much more often.
- there are programming languages not using exceptions intensively. So they do not create eh_frame section at all or have it very small. This would lead to not shift start of text section much (as in your explanation why the problem is less important for lld).
The image base doesn't matter (is zero) for -shared/-pie. I've shared my thoughts why this doesn't likely cause problems to lld.
In lld, the PF_R PT_LOAD is placed before the PF_R|PF_X PT_LOAD (text segment). If you have a compilation unit that is sufficiently
large (large DW_AT_high_pc, 0 DW_AT_low_pc), it is reasonable to have a sufficiently large PF_R PT_LOAD.
Then the range will not collide with a good compilation unit with large DW_AT_low_pc.
On the GNU linkers' side, ld.bfd -z separate-code is a recent addition and it doesn't move the large .rodata etc before .text.
gold hasn't implemented -z separate-code. They are more vulnerable to the issue.
I do not really understand why we are discussing that question?
Even if that problem is less important for lld than for others linkers - it still exists.
So we need to patch lld anyway.
Nevertheless above description based on the fact that :
" If you have a compilation unit that is sufficiently
large (large DW_AT_high_pc, 0 DW_AT_low_pc), it is reasonable to have a sufficiently large PF_R PT_LOAD.
Then the range will not collide with a good compilation unit with large DW_AT_low_pc."
taking into account that :
ld sets default image base to 0x400000 for x86_64.
gold default image base is 0x400000 for x86_64.
lld default image base is 0x200000 for x86_64
- size of sections put before .text section by lld should be greater than 0x200000 to have a benefit comparing with other linkers. In case size of sections would be less than 0x200000 then lld would have a problem but other linkers would not.
- Actually it is not correct to link size of .text section and size of previously put sections as in above assumption. Size of text sections should be compared with size of deleted code. if there were deleted big piece of data then broken range could be big - 0x0-0x300000 f.e; But correct code could be smaller(size of text section is 1000 f.e) - in this case there would be overlapping address ranges in-spite of the fact that lld put some sections before .text section.
Could you explicitly describe that case, please ? i.e. When using UINT64_MAX-1 as start of address range for relocations pointing to deleted code would not be good for others ?
You are the one who propose to change the status quo and use UINT64_MAX-1. The burden of proof is on your side. You have the obligation to collect evidence from approaches taken by various DWARF consumers to prove the newly introduced special value wouldn't cause problems. However, I'd be happy to dig up history of various debuggers to learn what they do when I am free.
I also want to make it clear: your code changes the general behavior that applies to any non-allocated section, not something as specific as how a DW_AT_low_pc in a .debug_info is resolved. .rela.debug_* is a case of relocations into non-allocated sections but it isn't the only case. Do we have other such sections that may be thwarted (say -2 might be harmful to them) by your patch?
llvm-symbolizer and addr2line just do not have enough information to properly select correct address range.
If other DWARF consumers (debuggers, dtrace, profilers, etc) are happy with 0 DW_AT_low_pc, the problem is probably on the symbolizers' side. And I have sufficient argument that addr2line doesn't do well and I am about to file a bug.
taking into account that :
ld sets default image base to 0x400000 for x86_64.
gold default image base is 0x400000 for x86_64.
lld default image base is 0x200000 for x86_64
- size of sections put before .text section by lld should be greater than 0x200000 to have a benefit comparing with other linkers. In case size of sections would be less than 0x200000 then lld would have a problem but other linkers would not.
- Actually it is not correct to link size of .text section and size of previously put sections as in above assumption. Size of text sections should be compared with size of deleted code. if there were deleted big piece of data then broken range could be big - 0x0-0x300000 f.e; But correct code could be smaller(size of text section is 1000 f.e) - in this case there would be overlapping address ranges in-spite of the fact that lld put some sections before .text section.
You arguments here make me think we are not on the same page. The image base is used only by non-PIE executable, neither -pie nor -shared. The .debug_info input section associated with a discarded text section should have its DW_AT_high_pc larger than the DW_AT_low_pc of a good compilation unit to cause collision, not less than 0x200000 as you said.
size of previously put sections
Actually they matter a lot. lld places the PF_R PT_LOAD segment before PF_R|PF_X PT_LOAD (text segment). The text segment contains compilation units. If the PF_R is sufficiently large, it will make the relative address of the text segment larger, thus unlikely to collide.
You are the one who propose to change the status quo and use UINT64_MAX-1.
The burden of proof is on your side. You have the obligation to collect evidence from
approaches taken by various DWARF consumers to prove the newly introduced special value wouldn't cause problems
Sure. I though you have in mind something concrete.
I checked behavior of addr2line and gdb -they are working correctly with this patch.
Could you tell which check would be good from your opinion to do more, please ?
I also want to make it clear: your code changes the general behavior that applies to any non-allocated section,
not something as specific as how a DW_AT_low_pc in a .debug_info is resolved. .rela.debug_*
is a case of relocations into non-allocated sections but it isn't the only case.
Do we have other such sections that may be thwarted (say -2 might be harmful to them) by your patch?
Originally that patch was limited to only .debug* sections. But it was requested to extend it to all non-allocated section.
https://reviews.llvm.org/D59553#1445935
Since there is no requirement that certain sections should be assigned to 0 then I agree it should not be a problem to assign any other suitable value for them.
If other DWARF consumers (debuggers, dtrace, profilers, etc) are happy with 0 DW_AT_low_pc,
the problem is probably on the symbolizers' side. And I have sufficient argument that addr2line doesn't do well and I am about to file a bug.
actually other DWARF consumers are not happy. They implemented various heuristics.
Instead of implementing various, potentially not matching, heuristics it is possible to just avoid address ranges clash.
Could you confirm that solution suggested in D60470 could not handle following situation correctly ?
One piece of debug info references removed address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
Another piece of debug info references correct address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
You arguments here make me think we are not on the same page. The image base is used only by non-PIE executable,
neither -pie nor -shared. The .debug_info input section associated with a discarded text section
should have its DW_AT_high_pc larger than the DW_AT_low_pc of a good compilation unit to cause collision, not less than 0x200000 as you said.
One piece of debug info references removed address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x350000
Another piece of debug info references correct address range DW_AT_low_pc=0x300000 DW_AT_high_pc=0x390000
in this case DW_AT_high_pc of discarded text section larger than the DW_AT_low_pc of a good compilation unit.
lld put tables which size is 0x100000 before .text section so DW_AT_low_pc=0x300000.
lld generated executable would have a problem. ld and gold generated would not.
Actually they matter a lot. lld places the PF_R PT_LOAD segment before PF_R|PF_X PT_LOAD (text segment).
The text segment contains compilation units. If the PF_R is sufficiently large, it will make the relative address of the text segment larger, thus unlikely to collide.
they matter if they are significantly large - otherwise they are not.
One piece of debug info references removed address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x280000
Another piece of debug info references correct address range DW_AT_low_pc=0x202000 DW_AT_high_pc=204000
in this case DW_AT_high_pc of discarded text section larger than the DW_AT_low_pc of a good compilation unit.
lld put tables which size is 0x2000 before .text section so DW_AT_low_pc=202000.
lld generated executable would have a problem. ld and gold generated would not.
actually other DWARF consumers are not happy.
Example please.
I checked behavior of addr2line and gdb -they are working correctly with this patch.
Can you check if gdb works without this patch?
Could you confirm that solution suggested in D60470 could not handle following situation correctly ?
Another piece of debug info references correct address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
D60470 doesn't specialize case zero address, it changes priority. I suggest you actually take a look at that patch. The behavior of ld.bfd/gold will unlikely to change. D60470 will also help them.
I want to see a concrete example where 0 is used as a valid DW_AT_low_pc.
I can change vm.mmap_min_addr to 0 on my Linux and mmap a fixed page at 0 freely, but that doesn't mean 0 can be a valid DW_AT_low_pc. The 0 page can be mapped but more likely used as file header, irq, etc but unlikely to be used as code.
lld generated executable would have a problem. ld and gold generated would not.
You may check how the R and RW PT_LOAD segments are laid out in ld.bfd and gold linked modules.
You were also wrong with your understanding of image base (it was irrelevant in -shared/-pie) in your previous comments.
I just feel my explanation about how lld's layout largely mitigates the issue is kept being ignored.
In Sony executables, the file header is not stored in loadable memory. Our first PT_LOAD has address 0 and sections start from that address. Technically, we don't have any code at that address yet, but we are planning on removing the limitation that stops us from doing so, in which case we will have code at address 0. As mentioned earlier, we have solved the problem for now by using a special value in DW_AT_low_pc, and other techniques, but these all require private patches, so an alternative would be nice.
I do agree that this patching should be limited to debug sections though. Other consumer sections might have other rules regarding special values etc, and be working on the assumption of patching to 0 is the best plan.
actually other DWARF consumers are not happy.
Example please.
please check following behavior of lldb, llvm-symbolizer, gnu addr2line, gnu objdump :
$ cat main.cpp
void foo_used();
int main(void) {
foo_used(); return 0;
}
$ cat funcs.cpp
void foo_not_used () {
__asm__(".rept 2105344; nop; .endr");
}
void foo_used () {
__asm__(".rept 10000; nop; .endr");
}
$ clang++ -gdwarf-4 -O funcs.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O main.cpp -ffunction-sections -c
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq %rax
res.out[0x203811] <+1>: callq 0x2010f0 ; foo_not_used + 240
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl %eax, %eax
res.out[0x203818] <+8>: popq %rcx
res.out[0x203819] <+9>: retq
(lldb) b foo_used
Breakpoint 1: where = res.out`foo_not_used() + 240, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:2:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ objdump -d -S res.out
0000000000201000 <_start>:
void foo_not_used () {
__asm__(".rept 2105344; nop; .endr");
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
201000: 31 ed xor %ebp,%ebp 201002: 49 89 d1 mov %rdx,%r9 201005: 5e pop %rsi
And this patch makes them happy:
$ lldb res.out
(lldb) disassemble -name main
res.out`main:
res.out[0x203810] <+0>: pushq %rax
res.out[0x203811] <+1>: callq 0x2010f0 ; foo_used at funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
res.out[0x203816] <+6>: xorl %eax, %eax
res.out[0x203818] <+8>: popq %rcx
res.out[0x203819] <+9>: retq
(lldb) b foo_used
Breakpoint 1: where = res.out`foo_used() at funcs.cpp:6:5, address = 0x00000000002010f0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ llvm-symbolizer -obj=res.out 0x00000000002010f0
foo_used()
/home/avl/bugs/gc_debuginfo/funcs.cpp:6:5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ objdump -d -S res.out
0000000000201000 <_start>:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
201000: 31 ed xor %ebp,%ebp 201002: 49 89 d1 mov %rdx,%r9 201005: 5e pop %rsi 201006: 48 89 e2 mov %rsp,%rdx 201009: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
I checked behavior of addr2line and gdb -they are working correctly with this patch.
Can you check if gdb works without this patch?
gdb works correctly without this patch for cases when zero virtual address is not a correct value for code.
Because it uses 0 as special value indicating that address range is incorrect.
It would not work correctly for the platform where 0 is a valid vma for code.
It is generally wrong to use 0 as a special value.
Using 0 as the beginning of the section with code is not forbidden.
I want to see a concrete example where 0 is used as a valid DW_AT_low_pc.
- according to @jhenderson Sony has/would have in future such use case.
- according to this http://lists.llvm.org/pipermail/lldb-dev/2017-March/012091.html Qualcomm Kalimba DSP and the XAP RISC CPU have this use case. they even use similar to this patch approach to solve the problem.
- Such use case often used for embedded systems.
I personally do not have access to such systems to demonstrate a use case for you.
lld generated executable would have a problem. ld and gold generated would not.
You may check how the R and RW PT_LOAD segments are laid out in ld.bfd and gold linked modules.
For the above example :
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=lld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000002010f0
/home/avl/bugs/gc_debuginfo/funcs.cpp:2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=ld -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x00000000004004a0
/home/avl/bugs/gc_debuginfo/funcs.cpp:6
$ clang++ -gdwarf-4 -O funcs.o main.o -fuse-ld=gold -Wl,--gc-sections -o res.out
$ addr2line -e res.out 0x0000000000400510
/home/avl/bugs/gc_debuginfo/funcs.cpp:6
lld generated binary has problem, ld and gold do not have.
I do agree that this patching should be limited to debug sections though.
Other consumer sections might have other rules regarding special values etc, and be working on the assumption of patching to 0 is the best plan.
OK, I would change a patch accordingly and update shortly.
I don't see a problem with changing lldb's behavior to add special treatment for address zero. It's not the most ideal solution (that would be to avoid generating the debug info in the first place), but given that we have to accept inputs from other linkers too (which also implement this behavior), it's probably the best thing we can do now.
In general it is not possible to handle zero case properly in lldb. It does not have information which address range is correct.
if platform allows section to contain code starting from zero then it is not possible to detect good and bad address range:
One piece of debug info references removed address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
Another piece of debug info references correct address range DW_AT_low_pc=0x0 DW_AT_high_pc=0x10000
which address range should be selected ?
Solution from this patch prevents such situations. lld would not generate clashed correct address ranges at all.
Do you think it is bad to have a solution which allows to not to have clashed address ranges ?
lldb still is able to implement some heuristics for code generated by ld/gold.(which would also not work for zero-address platform)
But binaries generated by lld would be properly handled by all DWARF consumers without any change in them.
Because there would not be clashed addresses.
limit resolving relocations to UINT64_MAX-1 value for only debug sections, as requested.
I'm personally happy with the implementation now, but obviously there needs to be agreement from the other reviewers.
ping. @ruiu Do you think it would be useful to integrate this patch ?
this patch already has Looks Good from @jhenderson and @grimar .
It helps lldb, llvm-symbolizer, gnu addr2line, gnu objdump to show debug info properly in case clashed address ranges.
I'm fine with this patch. We've discussed a lot about what is a desirable value for a "null" for debug info, but looks like as long as it does not appear as a valid address, it should be fine.
Is Eric fine with this?
@avl I suspect we probably should implement the PRETEND/CB_PRETEND logic in lld instead, to properly fix https://reviews.llvm.org/D63772
I'll need to do some experiments. If we have PRETEND, we will not need the -2 hack in relocateNonAlloc
@MaskRay Hi MaskRay, PRETEND is a connected but separate issue. I agree it would be good to implement some kind of the PRETEND/CB_PRETEND logic in lld. But it does not make obsolete this patch. implementing of PRETEND logic answers the question "how to select proper debug info for COMDAT section". That patch answers the question "How to avoid overlapping address ranges in case debug info referenced to deleted code presented". This patch makes sense not only for PRETEND. Thus, I do not agree with "If we have PRETEND, we will not need the -2 hack in relocateNonAlloc". Let`s not mixup issues.
@avl, something you might want to watch out for is the behaviour of llvm-dwarfdump --verify where this patching has happened. I've seen it spit out complaints about "invalid address ranges" for this situation.
@jhenderson Thank you for pointing this out! Actually --verify complains about errors in both cases(with the fix and without the fix) for the testcase from this review:
without this patch :
error: DIE has overlapping address ranges: [0x0000000000000000, 0x0000000000000006) and [0x0000000000000000, 0x0000000000000006)
error: DIEs have overlapping address ranges:
error: DIE has overlapping address ranges: [0x0000000000000000, 0x0000000000000006) and [0x0000000000000000, 0x0000000000000006)
error: DIEs have overlapping address ranges:
with this patch :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
there is not any address range to which these debug_info sections relate. This address range [0x0000000000000000, 0x0000000000000006) should probably also be reported by --verify since it less than image_base...
Probably, without eliminating broken debug_info there is no way to verify it without errors?
PRETEND/CB_PRETEND
I thought CB_PRETEND could fix a "error on relocations to discarded SHF_ALLOC sections" issue but I fixed it in another way. Let's ignore my comment about CB_PRETEND.
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
DW_AT_low_pc DW_AT_high_pc pairs like these are less ideal. Before, tools have learned 0 address is special. They may or may not need a special case to handle this. Now, -2 is introduced (whatever value we choose here, GNU linkers still either relocate these to 0 or use their CB_PRETEND logic). They need to learn one more rule. I still believe, teaching the symbolizers (addr2line, llvm-symbolizer D60470) is the way forward.
the point is in that tools should NOT learn one more rule. There should not be done any additional logic for UINT64_MAX-1 in tools. the fact that llvm-dwarfdump --verify displays error message is probably normal. Because debug info references not existed address range. llvm-dwarfdump --verify dislays errors for current solution also. The proper way to fix llvm-dwarfdump --verify is to remove broken debug info.
I agree. As things stand, some tools have special case behaviour for 0, but many do not, and consequently get it wrong. Furthermore, there are some cases where it is impossible to determine whether 0 is special or not. For example for PIE output, 0 can be a valid address (not all linker scripts place the headers in memory), meaning the tools can't distinguish between a function at address 0, or a dead piece of debug information. The logic for -2 is simpler, because it is almost never the case that something will appear in memory at that location, so any logic required is much simpler. In those cases where there already is special handling, it is not much harder for -2 to be added as special and will work FAR more often than 0 will, whilst it keeps logic in tools that don't have 0 special-case handling simpler.
lld/test/ELF/gc-sections-debuginfo.s | ||
---|---|---|
41 ↗ | (On Diff #201495) | Nit: looks like there's a double blank line at EOF here? |
I disagree. For PIE output, 0 (__ehdr_start) points to the ELF header. It cannot be a valid code address. Note, the discussion is really about code addresses (DW_AT_low_pc, DW_AT_ranges, DW_AT_entry_pc, etc), not an unconstrained relocated file offset.
Not if the user is using linker scripts:
/* test.script */ PHDRS { ph_load PT_LOAD FLAGS(0x1|0x2|0x4); } SECTIONS { .text : {*(.text)} : ph_load .dynsym : { *(.dynsym) } }
bar.c:
int bar(int x) { x += 10000; return x ? x/12345 : 10; } void _start() { bar(1234); bar(4321); } int foo(int x, int y) { x += 10000 + y; y += x - 123456; bar(1234); bar(4321); foo(1234, 4321); foo(4321, 1234); return y ? x/12345 : 10; }
Command-line:
clang --target=x86_64-pc-linux bar.c -c -g -ffunction-sections ld.lld -pie bar.o -T test.script -o bar.elf --gc-sections
This results in the symbol bar being at address 0. The ELF header appears outside loadable memory. llvm-dwarfdump shows that 0 has been patched in the .debug_ranges field for both bar and foo. Running llvm-symbolizer and GNU addr2line produces mixed results:
$ llvm-symbolizer 0 0x40 0x50 0x80 0x100 -e bar.elf bar C:\Work\TempWork\bar.c:2:0 bar C:\Work\TempWork\bar.c:4:5 _start C:\Work\TempWork\bar.c:8:0 ?? C:\Work\TempWork\bar.c:25:12 ?? ??:0:0 $ addr2line 0 0x40 0x50 0x80 0x100 -e bar.elf -f bar C:\Work\TempWork/bar.c:14 bar C:\Work\TempWork/bar.c:19 _start C:\Work\TempWork/bar.c:22 foo C:\Work\TempWork/bar.c:25 ?? ??:0
Issues I noticed, some of which are possibly bugs, but it's hard to tell:
- llvm-symbolizer reports the line number corresponding to the removed foo for the fourth case.
- GNU addr2line line numbers are wrong in the first three cases, and foo is reported in the fourth.
- GNU addr2line thinks foo is the right symbol for the fourth case, even though it has been removed. llvm-symbolizer just prints '??' (that's probably correct).
I haven't checked to see how many of these issues are fixed by D60470.
Issues I noticed, some of which are possibly bugs, but it's hard to tell:
- llvm-symbolizer reports the line number corresponding to the removed foo for the fourth case.
- GNU addr2line line numbers are wrong in the first three cases, and foo is reported in the fourth.
- GNU addr2line thinks foo is the right symbol for the fourth case, even though it has been removed. llvm-symbolizer just prints '??' (that's probably correct).
I haven't checked to see how many of these issues are fixed by D60470.
With D60470, you still see the following for the fourth case:
?? C:\Work\TempWork\bar.c:25:12
It doesn't intend to fix this problem. What it tries to address, is if such discarded range overlaps with a valid normal range, the valid normal range will take priority. That change may also benefit ld.bfd and gold linked modules if their CB_PRETEND logic fails to apply. Because it also simplifies the code, I think it is still worth applying no matter the decision on this lld patch.
/* test.script */ PHDRS { ph_load PT_LOAD FLAGS(0x1|0x2|0x4); } SECTIONS { .text : {*(.text)} : ph_load .dynsym : { *(.dynsym) } }
I agree in this case .text can have 0 link-time address.
I guess you probably don't need .dynsym. If this is a dynamically linked executable, it will have .interp, which is usually in the first page. Then .text will not have 0 sh_addr, and 0 will be an invalid code address. It is not common to have .dynstr after .text - it is before .text in BFD's default linker script.
For similar reasons, there should not be any .gnu.hash .dynstr .rodata .eh_frame etc. They are placed before .text by default in lld.
If you have .note* sections, you'll probably also place them in the first page. This ensures they are dumped by FreeBSD and Linux kernels when the core file size is limited.
In summary, to make 0 a valid code address in an lld linked module, lots of assumptions have to be made. These assumptions make the lld patch seem pretty much a hack for a very specialized use case.
I agree in this case .text can have 0 link-time address.
I guess you probably don't need .dynsym. If this is a dynamically linked executable, it will have .interp, which is usually in the first page. Then .text will not have 0 sh_addr, and 0 will be an invalid code address. It is not common to have .dynstr after .text - it is before .text in BFD's default linker script.
For similar reasons, there should not be any .gnu.hash .dynstr .rodata .eh_frame etc. They are placed before .text by default in lld.
If you have .note* sections, you'll probably also place them in the first page. This ensures they are dumped by FreeBSD and Linux kernels when the core file size is limited.
In summary, to make 0 a valid code address in an lld linked module, lots of assumptions have to be made. These assumptions make the lld patch seem pretty much a hack for a very specialized use case.
Having 0 as a valid code address is quite common for embedded world.
Please check that article, it describes exactly such a case https://www.embeddedrelated.com/showarticle/900.php
The linker script for that example has
MEMORY { flash (rx) : ORIGIN = 0, LENGTH = 150K ram (rw!x) : ORIGIN = 0x800000, LENGTH = 32K } SECTIONS { .text : { ./crt0.o(.text*) *(.text*) *(.strings) *(._pm*) *(.init) *(.fini) _etext = . ; . = ALIGN(4); } > flash
Other than that there were another mentions of " 0 as a valid code address" in this thread. Sony has such a case. Message from lldb-dev mentioned that: http://lists.llvm.org/pipermail/lldb-dev/2017-March/012091.html . Having all of that, "0 as a valid code address" does not look like very specialized use case.
It would be useful to support it, especially for embedded applications.
I am following, I've just been out for 3 weeks. I'm catching up now and this is in my queue.
Folks, I would like to ask a question related to this review and DWARF standard:
DWARF5 standard has the following statement related to non-continuous address ranges:
2.17.3 Non-Contiguous Address Ranges
...
A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored.
Would not it be useful to extend it to explicitly explain the following case
(for some next version of the standard)
"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored. "
?
This situation (LowPC > HighPC) already represents an invalid range.
llvm-dwarfdump --verify :
error: Invalid address range [0xfffffffffffffffe, 0x0000000000000004)
Thus it would be useful to indicate that explicitly and describe that default handling would be to skip.
So that broken ranges do not affect valid ranges.
For this patch, it would mean that practically all cases
would lead in this situation, and DWARF readers would adequately skip them :
LowPC -> points to deleted section -> set to UINT64_MAX-1; HighPC -> points to deleted section -> set to UINT64_MAX-1; ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; HighPC -> points to correct section -> set to some value less than UINT64_MAX-1; ignored.
LowPC -> points to deleted section -> set to UINT64_MAX-1; Length has some value -> HighPC = UINT64_MAX-1 + value -> overflow -> less than LowPC ignored.
What do you think?
rebased. updated according to the discussion on llvm-dev.
It is currently being discussed, so it could be changed more.
(again, thanks always for your ongoing patience on this complicated/long-term issue)
The flag name for disabling this seems a little confusing to me - since there's no option to really "disable" this sort of marking, but different choices in what value is used. So maybe something like "--dead-reloc-zero-addend" or something (don't go changing the patch on my bad suggestion here, but perhaps @MaskRay & others on this thread might have better ideas for names based on linker flag name history). I wouldn't worry about debating/changing this urgently - just a thought to ponder while we continue with the rest of the discussion.
Perhaps if we're adding the option, the option value should be configurable, so something like --dead-reloc-addend=<value> which specifies the value instead of just assuming zero. You could even go further e.g. --dead-reloc-addend=<section>=<value> (your syntax may vary) to allow configuring it per section. Thus, the default would be something like --dead-reloc-addend=.debug_ranges=0xfffffffffffffffe --dead-reloc-addend=.debug_info=0xffffffffffffffff and so on. This would allow users to configure things according to the needs of their individual system.
Doable if necessary (will you likely have a need for this in PS4? that'd make it easier to justify adding the feature) - but I'd try to avoid adding such flexibility for as long as possible to avoid proliferation of even more variants of debug info consumers might then have to deal with.
As far as I know, we don't have any need to change from the -2/-1 values, so no specific justification from that point of view. I'm mostly thinking of weird cases where users might have to use addresses in the space we're using for whatever reason (signed addresses/wrapped address space etc).
Do we have agreement on following option syntax;
--dead-reloc-addend=<value>
?
If yes - I would update the patch.
I'd like to avoid adding a flag at all, if possible.
Is there already a clear case where a one-size-fits-all solution won't be sufficient?
I'd like to avoid adding a flag at all, if possible.
without the flag there is a risk that some tools expecting 0(current behavior) would work incorrectly with -1/-2.
I do not know whether that problem really exist. But, since we are changing current default behavior, that problem could arise.
Thus, the flag which could restore old behavior - looks useful.
Is there already a clear case where a one-size-fits-all solution won't be sufficient?
I did not clearly get it. is "one-size-fits-all" means option which affects all tables(without special cases for debug_loc/debug_ranges)?
Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.
Is there already a clear case where a one-size-fits-all solution won't be sufficient?
I did not clearly get it. is "one-size-fits-all" means option which affects all tables(without special cases for debug_loc/debug_ranges)?
Sorry, by "one-size-fits-all" I just meant "without a flag to customize the behavior". ie: do we know of specific use cases where the proposed -1/-2 behavior is definitely not going to be usable (that would make the flag discussion easier/more clear)?
Yeah, I just worry then it becomes another source of inconsistency between consumers/producers/etc. Perhaps we could name it in some way that's clear that it might be removed later - to encourage people to either reach out to us to explain their use case, or to invest a bit more in working with the new default behavior.
--dead-reloc-obsolete-addend
or
--dead-reloc-obsolete
? (yes/no option).
Sorry, by "one-size-fits-all" I just meant "without a flag to customize the behavior". ie: do we know of specific use cases where the proposed -1/-2 behavior is definitely not going to be usable (that would make the flag discussion easier/more clear)?
I see, now. Thank you. I do not know such cases.
Not really sure - sometimes flags use the word "experimental" but that more sounds like it might become a real flag in the future, whereas the hope is that it's the other direction. The only other mechanism I've seen is the really verbose flag " -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang" - but I don't think that's really necessary here. Probably "experimental" (eg: --experimental-dead-reloc (I think 'experimental' usually goes at the start - but you could check other flags for inspiration)) is enough, if we are going to have a flag at all.
Answering this very old question:
Case 1 is a zero-length range and would always be interpreted that way.
Case 2 should not occur, I think; the compiler should not produce a range where low/high point to different sections.
Case 3 we are hoping will be detected and treated as an invalid range by consumers.
There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.
Case 2 should not occur, I think; the compiler should not produce a range where low/high point to different sections.
It could occur when discarded section followed with the valid section.
The range for discarded section is [LowPC,HighPC). HighPC points to the next byte after specified range.
Thus HighPC could be valid address - LowPC of next range.
Case 3 we are hoping will be detected and treated as an invalid range by consumers.
Probably, it makes sense to describe it directly in the DWARF standard?
The same how it is already spoken for the zero-length ranges:
"A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored."
For LowPC>HighPC it could sound similarly:
"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored."
Thus, above situation would be explicitly described and then should be processed accordingly by consumers.
There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.
yes. So we need more rules: Base address is being marked with -1/-2 according to last discussions.
So DWARF standard should explicitly describe that all ranges using base address of -1/-2 value become invalid.
Right?
It's /possible/ for a compiler to use such code to encode a low/high range, but seems quite unlikely that a compiler would do that in practice (compared to using an offset relative to the section the code is in). I think it'd be OK to ignore that case for now/suggest that if a compiler does such a thing it's up to that compiler to figure out how to get the tombstone value.
But I'm also OK saying that if either the low or high pc have the tombstone value, the range is invalid - so a produce could, in theory do what you're describing. But seems unlikely they ever would.
Case 3 we are hoping will be detected and treated as an invalid range by consumers.
Probably, it makes sense to describe it directly in the DWARF standard?
The same how it is already spoken for the zero-length ranges:"A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored."For LowPC>HighPC it could sound similarly:
"A bounded range entry whose beginning address offset greater than ending address offset
indicates an invalid range and may be ignored."Thus, above situation would be explicitly described and then should be processed accordingly by consumers.
There is an additional case to consider, which is this: Ranges can be specified (and in LLVM's case, now usually are) using a "base address" and then the low/high values are *both* offsets from that base address. We would want consumers to notice the base address was a special "invalid" or "points to nothing" value, and ignore any range that used low/high offsets from that base address.
yes. So we need more rules: Base address is being marked with -1/-2 according to last discussions.
So DWARF standard should explicitly describe that all ranges using base address of -1/-2 value become invalid.
Right?
I think the thing for the DWARF spec to do would be to say a fundamental address of -1 (the DWARFv6 spec won't need to talk about -2, because that'll just be a compatibility issue for DWARFv4 and below - and DWARF doesn't issue fixes to old specs - so it'll be just an implementation detail between DWARF producers and consumers) is the tombstone/reserved value to describe dead code and that any arithmetic on that value is ignored (so you can't use ((address)-1) + 5 to reach a valid address -1 remains -1 or otherwise tracks invalid through any arithmetic).
How about we just go with the basic --dead-reloc-addend (or similar) with no experimental or obsolete in the name? In that case, I'd make it hidden, so users can't find it in the normal help text, and add comments both in the help text (for --help-hidden, and comments in the code) and in the release notes saying it is intended as a chicken switch for compatibility reasons, and that they should contact the LLVM community if the switch is ever actually needed, but if we don't hear we will remove it in the future.
Not sure how many people discover these things by reading the manual or release notes - once it's on a stackoverflow post all context other than the name is probably lost.
But I don't feel too strongly about this if everyone else doesn't.
I've filed a DWARF issue to reserve "-1" as the tombstone address value. I wrote it up such that consumers should ignore that address and any address derived from it (e.g., base+offset). I don't have the issue number for it yet, hopefully I'll remember to post it here when it's assigned.
Using -2 for .debug_loc/.debug_ranges would have to be a convention, but is entirely "legal" from a DWARF standards perspective.
Thanks so much for starting the discussion/process with the DWARF Committee! (I guess once it's got an assigned number you could link it here & we can go CC ourselves ,maybe?)
If we can avoid --dead-reloc-addend, avoid it.
If we have to add an option, I like --dead-reloc-addend=section_glob=signed_or_unsigned (http://lists.llvm.org/pipermail/llvm-dev/2020-June/142073.html )
"My mere complaint is that the relocation record is not dead, but rather its referenced symbol is dead. However, I can't think of a better name..."
One thought when I started the thread (with the same subject as the llvm-dev's) on binutils, gdb and elfutils-devel (https://sourceware.org/pipermail/binutils/2020-May/111357.html ) was to collect use cases where -1/-2 may potentially be troublesome. (I haven't seen any reply suggesting that -1/-2 may be problematic yet.)
It looks like most common opinion is to not to create an option. Will remove it and update the patch.
Ah @MaskRay posted the link to the DWARF issue ( http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 ) in another thread. In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :
Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.
(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)
Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.
I'm not aware of that other thread.
In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :
Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)
No dispute with any of the above. In some respects, yeah I glossed over some things or got details wrong, but the point is: There used to be more copies of a thing, now there are fewer, what is appropriate for DWARF to say about that? Allowing a more efficient way to say "not here boss" is what matters.
Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.
DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.
If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.
https://sourceware.org/pipermail/binutils/2020-June/111640.html I will share results here or our llvm-dev thread if there is useful feedback.
In lieu of somewhere better to discuss it, just a couple of points to make, @probinson :
Currently if two inline functions are /identical/, then at least with the unix linkers I know of (bfd, gold, lld), to the best of my knowledge - the linker resolves relocations to either original copy to the final copy - so both CUs point to the same function. (with the complications as you pointed out of overlapping aranges, CU ranges, etc)
If the inline functions aren't identical, then the preserved copy is only pointed to by the DWARF that originally described it - that's important because different optimizations on the function could invalidate the DWARF (eg: one file claims the variable A is in register B, the other claims it is in register C - if both subprogram descriptions point to the final copy, one of them will be incorrect)
As for Identical Code Folding - I think it might actually be valuable to preserve that information and have both subprograms point to the same code (even with overlapping aranges/CU ranges/etc). But I could understand if that's a problem for some consumers - probably the sort of thing I'd be inclined to/hope DWARF allows, even if it suggests some implementations might choose not to take advantage of it.(summary-ish:
comdat functions that are optimized differently: subprograms must not refer to the differently optimized versio
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)No dispute with any of the above. In some respects, yeah I glossed over some things or got details wrong, but the point is: There used to be more copies of a thing, now there are fewer, what is appropriate for DWARF to say about that? Allowing a more efficient way to say "not here boss" is what matters.
// clang -g -c a.c // ld.lld --icf=all a.o void foo() {} // .text.foo void bar() {} // .text.bar , folded by ICF
Resolving a relocation referencing .text.bar to the tombstone value will make .debug_* diverge more from other non-SHF_ALLOC sections. It is not difficult to implement, though:
if (Defined *ds = dyn_cast<Defined>(&sym)) ds->section->repl == ds->section => (not folded by ICF)
Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.
DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.
If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.
For sure!
Wording-wise: I'd have thought DWARF would be a bit more generic rather than talking about how producers might produce code, or how linkers might be involved in that process - just saying formally "-1 based addresses should be treated the same way as if the attribute were not present" - if a producer, for whatever reason, wants to produce a -1 address up-front rather than only by the linker, doesn't seem like DWARF shuold care/have an opinion on that.
DWARF already acknowledges the existence of COMDAT and linkers, for purposes of motivating type sections. A similar motivation seems necessary here. But as you say, it's probably not necessary to intrude that into the formal specification, and just have it say there are two ways to say it's not there (omit the attributes, or use the tombstone). I thought about moving that part out into nonnormative and probably should have followed that impulse.
If you think there would be real value to it (and knowing that the DWARF committee is not actively meeting at the moment, so it will be a while before this gets discussed), I can resubmit the issue with a cleaner writeup and mark the current one as superseded.
Yeah, I doubt it's worth resubmitting or anything, totally your call on whether you think the finer points will be a distraction or not (you know the folks on the committee/kind of conversational dynamics/etc far better than I do) - probably something to keep in mind when it comes to finalizing the wording.
(summary-ish: comdat functions that are optimized differently: subprograms must not refer to the differently optimized version
comdat functions that are optimized identically: multiple subprograms could point to the same code, but there's probably little benefit/reason to do so
identical unrelated functions: there's value in multiple subprograms pointing to the same code)
Resolving a (relocation referencing a symbol defined in a discarded COMDAT group) to the tombstone value may be more troublesome.
clang -c -ffunction-sections a.cc # no debug info clang -c -ffunction-sections -g b.cc
Assuming a.cc & b.cc define the same function, should the linker resolve relocations in b.o(.debug_info) to the tombstone value? Maybe not because we'll lose all debugging information.
If both a.o & b.o have debugging information, it may be reasonable to resolve relocations in b.o(.debug_info) to the tombstone value.
if it comes to that, yeah, it would be nice-to-have if one debug info copy got the non-tombstone value, but I wouldn't stress about that - it's a pretty narrow case, and would only apply with a partial-debug build (we have a lot of debug info optimizations that are on-by-default (except for lldb) that assume the whole program is built with debug info).
FWIW, I wonder whether the test case would be better written with YAML rather than assembly input. It would allow for more explicitly specification of what relocations are emitted, and would allow reusing the same yaml blob using the -D option for the differences between the 32-bit and 64-bit cases, relocation types etc.
lld/ELF/InputSection.cpp | ||
---|---|---|
845–846 | returns a special value | |
852–853 | I might well have missed something, but why this special casing specifically for X86_64? Do other 64-bit targets not have the same issue (I'm assuming the masking is required to prevent some sort of "too big to fit" error from the relocation processing). It also looks like some of this if statement is untested (specifically the bit about EM_X86_64). | |
873 | I struggled to interpret this variable name without looking at how it is used. Perhaps noPrefixName might be better. | |
932–933 | to resolve such relocation -> to resolve it | |
lld/test/ELF/gc-sections-debuginfo.s | ||
3 ↗ | (On Diff #269874) | Is there a reason you are echoing this content, rather than having it written in the test file like most tests (I assume it's the need for two different blocks of assembly)? If there is, then the test file should end with .test, rather than .s since this file currently contains no assembly. |
4 ↗ | (On Diff #269874) | Use different file names for each of the test cases, e.g. %t.x64 and %t.i386 |
18 ↗ | (On Diff #269874) | I'd move this whole comment to the top of the file to show what is being tested. I'd also change the debuginfo part of the test name to debug-data (i.e. gc-sections-debug-data.s). Actually, this isn't specific to --gc-sections right? The same code applies for discarded COMDATs, if I'm not mistaken? So should that be tested too? |
22 ↗ | (On Diff #269874) | to an predefined -> to a predefined |
23 ↗ | (On Diff #269874) | That test -> This test |
24 ↗ | (On Diff #269874) | would be -> are |
25 ↗ | (On Diff #269874) | . And -> and We should explicitly test .debug_ranges as well as .debug_loc, since they are both special cased. |
26 ↗ | (On Diff #269874) | would be -> are |
64–65 ↗ | (On Diff #269874) | Nit: there's still a double blank line at EOF here that should be deleted. |
4 ↗ | (On Diff #191909) |
Hehe, amusing self-note looking back at this comment - I nowadays prefer the pattern the other way around, namely end one line with | \ and start the next line with a couple of extra spaces of indentation. I'm happy whichever way you wish to do it. |
would address all comments and rewrite test case with yaml.
lld/ELF/InputSection.cpp | ||
---|---|---|
852–853 |
right. lld reported "too big to fit" error. X86_64::relocate() function has following check: checkUInt(loc, val, 32, rel); It leads to this error for ILP32 case: relocation R_X86_64_32 out of range: 18446744073709551615 is not in [0, 4294967295]; consider recompiling with -fdebug-types-section to reduce size of debug sections. I checked these targets and they seem do not have above problem: aarch64 But , I missed sparcv9 target. It has similar problem.
It assumed to be tested with following line from test file:
specifying x86_64 inside triple should lead to config->emachine = EM_X86_64, as far as I understand. | |
lld/test/ELF/gc-sections-debuginfo.s | ||
3 ↗ | (On Diff #269874) | Originally I wrote this test as two files with assembly. But there was request to put it`s content into "echo". Would rewrite it with yaml, as requested. |
I think assembly is more readable than YAML. The assembly directive .reloc 0, R_X86_64_64, sym can encode arbitrary relocations.
addressed comments.
(not used yaml, since there is a request to use asm.
added tests for various architectures, not added test for COMDAT
sections yet)
I know that you have posted the patch for one year and spent much time on it. It would seem that I deprived the patch if I created a new one. Apologies that I just created D81784 to hopefully capture all the previous discussions. A bit more items why I started a new one:
- We don't need to special case .zdebug_* => actually we are thinking of dropping input .zdebug_* support (I am sorry that our internal toolchain still depends on it, so I can't remove it even if the community feels that dropping it is ok).
- R_X86_64_32 should not be checked. We should use target->symbolicRel
- We don't need *-gc-sections-debug-data-64.s for every target. The feature is sufficiently generic - x86_64 + x86 covers most cases. symbolRel logic has separately be checked in many tests.
- gc-sections is not a generic name. /DISCARD/, non-prevailing section group, ICF can create similar scenarios. We should use a generic test case.
- I want to explicitly check that addend is ignored.
- I want to a more comprehensive summary and call out to what is fixed and what isn't (non-prevailing section group).
returns a special value
indicate a reference from a debug