This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lld/
-
ELF/
14/17
InputSection.cpp
-
test/ELF/
-
ELF/
3/3
debug-dead-reloc-32.s
4/5
debug-dead-reloc-icf.s
2/4
debug-dead-reloc.s

Differential D81784

[ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF folded section symbols) to tombstone values
ClosedPublic

Authored by MaskRay on Jun 13 2020, 12:30 AM.

Download Raw Diff

Details

Reviewers

• espindola
dblaikie
probinson
avl
jhenderson
grimar
psmith

Commits

rGe618ccbf431f: [ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF…

Summary

See D59553, https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html and
https://sourceware.org/pipermail/binutils/2020-May/111357.html for
extensive discussions on a tombstone value.
See http://www.dwarfstd.org/ShowIssue.php?issue=200609.1
(Reserve an address value for "not present") for a DWARF enhancement proposal.

We resolve such relocations to a tombstone value to indicate that the address is invalid.
This solves several problems (the normal behavior is to resolve the relocation to the addend):

For an empty function in a collected section, a pair of (0,0) can terminate .debug_loc and .debug_ranges (as of binutils 2.34, GNU ld resolves such a relocation to 1 to avoid the .debug_ranges issue)
If DW_AT_high_pc is sufficiently large, the address range can collide with a regular code range of low address (https://bugs.llvm.org/show_bug.cgi?id=41124 )
If a text section is folded into another by ICF, we may leave entries in multiple CUs claiming ownership of the same range of code, which can confuse consumers.
Debug information associated with COMDAT sections can have problems similar to ICF, but is more complex - thus not addressed by this patch.

For pre-DWARF-v5 .debug_loc and .debug_ranges, a pair of 0 can terminate
entries (invalidating subsequent ranges).
-1 is a reserved value with special meaning (base address selection entry) which can't be used either.
Use -2 instead.

For all other .debug_*, use UINT32_MAX for 32-bit targets and UINT64_MAX
for 64-bit targets. In the code, we intentionally use
uint64_t tombstone = UINT64_MAX for 32-bit targets as well: this matches
SignExtend64 as used in relocateAlloc. (Actually UINT32_MAX does not work for R_386_32)

Note 0, we only special case target->symbolicRel (R_X86_64_64, R_AARCH64_ABS64, R_PPC64_ADDR64), not
short-range absolute relocations (e.g. R_X86_64_32). Only forms like DW_FORM_addr need to be special cased.
They can hold an arbitrary address (must be 64-bit on a 64-bit target). (In theory,
producers can make use of small code model to emit 32-bit relocations. This doesn't seem to be leveraged.)

Note 1, we have to ignore the addend, because we don't want to resolve
DW_AT_low_pc (which may have a non-zero addend) to -1+addend (wrap
around to a low address):

__attribute__((section(".text.x"))) void f1() { }
__attribute__((section(".text.x"))) void f2() { } // DW_AT_low_pc has a non-zero addend

Note 2, if the prevailing copy does not have debugging information while
a non-prevailing copy has (partial debug build), we don't do extra work
to attach debugging information to the prevailing definition. (clang
has a lot of debug info optimizations that are on-by-default that assume
the whole program is built with debug info).

clang -c -ffunction-sections a.cc    # prevailing copy has no debug info
clang -c -ffunction-sections -g b.cc

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MaskRay created this revision.Jun 13 2020, 12:30 AM

Herald added a reviewer: • espindola. · View Herald TranscriptJun 13 2020, 12:30 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: llvm-commits, arichardson, aprantl, emaste. · View Herald Transcript

MaskRay edited the summary of this revision. (Show Details)Jun 13 2020, 12:32 AM

MaskRay edited the summary of this revision. (Show Details)Jun 13 2020, 12:39 AM

Herald added a subscriber: kristof.beyls. · View Herald TranscriptJun 13 2020, 12:39 AM

Update comment

MaskRay added reviewers: dblaikie, probinson, avl, jhenderson, grimar, psmith.Jun 13 2020, 12:43 AM

MaskRay mentioned this in D59553: [WIP][LLD][ELF][DebugInfo] Use predefined value to mark debug address ranges pointing to deleted code..Jun 13 2020, 12:52 AM

Harbormaster completed remote builds in B60201: Diff 270566.Jun 13 2020, 1:02 AM

Harbormaster completed remote builds in B60202: Diff 270567.Jun 13 2020, 1:34 AM

I only partially followed the discussion on llvm-dev. The resolution sounds sensible to me. Will be good to get someone involved in the upstream discussion to check it over.

Not related to this patch:
FWIW Arm's proprietary linker has a relatively simple garbage collection algorithm for debug sections. Debug sections were partitioned into two sets, the first set contained relocations to non-debug sections, the second set did not but were reachable via relocations from the first set. After gc of non-debug sections, the members of the first set were marked live if any of the non-debug sections they referenced were marked live. We'd then traverse the relocations from the live members of set 1 to find the live members of set 2. This worked well for Arm's proprietary compiler as it put quite a lot of effort into generating separating out debug sections per functions. It wasn't perfect and there were .debug_info sections that referenced multiple independent .text sections, but did permit quite a lot of debug to be removed without having to parse the dwarf.

jhenderson added inline comments.Jun 17 2020, 12:52 AM

lld/ELF/InputSection.cpp
930	Nit: spurious space before the full stop.
lld/test/ELF/debug-dead-reloc-32.s
3	tombstone -> tombstone value Here and in other tests.
18	Is the 'e' for SHF_EXCLUDE? If so, do we actually want it and --gc-sections? Seems like one or the other would be sufficient to me.
lld/test/ELF/debug-dead-reloc-icf.s
3–6	One issue that I don't recall whether it was brought up previously: if we resolve one to -1, and the other has no debug data (e.g. they came from two different CUs, only one of which was compiled with -g), we could actually lose debug data for the function completely. I don't know what the right thing to do is in this situation, as I don't know whether it's better than the risk of confusing consumers. I don't think we can reasonably attempt to identify whether there is matching debug data for the kept version after all.
13	Perhaps it would be better to spell out all the zeros for the first relocation target, as it was a bit confusing seeing one set of 8 followed by a set of 16.

dblaikie added inline comments.Jun 17 2020, 10:02 AM

lld/test/ELF/debug-dead-reloc-icf.s
3–6	Yeah, I'd accept that as an existing problem/not one that I think this tombstone work should necessarily address - the old solution or the new solution would be compatible with a linker making a more nuanced approach to this problem. (but, in that separate discussion: I'm not sure this would be appropriate to fix - fixing this, I think, would mean having debug info influence which version of a function the linker chooses - increasing the risk of heisenbugs)

Address comments.

Add Note 2 to the description.

lld/ELF/InputSection.cpp
930	I tend to not write `.debug_ranges.` I will add `sections`
lld/test/ELF/debug-dead-reloc-32.s
18	I'll drop --gc-sections.
lld/test/ELF/debug-dead-reloc-icf.s
3–6	https://reviews.llvm.org/D59553#2083857 My understanding is that: the prevailing definition should have associated debug info. If not, the linker should not do extra work to attach debug info from other copies to the prevailing definition.

Fix debug-dead-alloc*.s

Harbormaster failed remote builds in B60680: Diff 271438!Jun 17 2020, 12:58 PM

Harbormaster completed remote builds in B60681: Diff 271439.Jun 17 2020, 1:30 PM

LGTM, but I'd want @avl to be happy with it, since he did the original work essentially.

lld/test/ELF/debug-dead-reloc-icf.s
3–6	FWIW, I looked at our properietary linker and downstream LLD ports, both of which have the -1 patching behaviour implemented, and we don't handle the "some with, some without" case at all, so ICF folded symbols can theoretically lose debug data, same as is implemented in this patch. So from our point of view, there's no regression, just a continued space from improvement.

This revision is now accepted and ready to land.Jun 18 2020, 12:41 AM

I am happy with it. There is a small comment inside. other than that lgtm.

lld/ELF/InputSection.cpp
932	I think it is better not to do string comparisons for every relocation pointing to deleted code. It is worth to move this name comparisons before the loop.

avl accepted this revision.Jun 18 2020, 4:14 AM

jhenderson mentioned this in D79962: Fix the verification of DIEs with DW_AT_ranges..Jun 18 2020, 5:56 AM

MaskRay marked 5 inline comments as done.Jun 18 2020, 12:16 PM

MaskRay added inline comments.

lld/ELF/InputSection.cpp
932	I think it is actually more efficient doing string comparisons here. The tombstone value code path is rare. By doing string comparisons here, we don't perform string comparisons in the normal code paths.

dblaikie added inline comments.Jun 18 2020, 12:56 PM

lld/ELF/InputSection.cpp
932	But you have to do it repeatedly for every relocation in this section, yes? - and .debug_loc and .debug_ranges may contain many relocations. The test for "is this a debug section" happens unconditionally once, outside the relocation application loop - it seems like this test could/should happen in the same place (& it can be conditional on "is this a debug section": ` bool isDebugLocOrRanges = isDebugLoc && name == ".debug_loc" \|\| name == ".debug_ranges"; ` Seems like it'd be cheap enough (known short length, the string's already hot in the cache, etc). (alternatively the test could be computed lazily and cached - but given "isDebug" isn't done that way I don't think this one should be either)

Hoist isDebugLocOrRanges

MaskRay marked an inline comment as done.Jun 18 2020, 1:21 PM

Harbormaster completed remote builds in B60889: Diff 271826.Jun 18 2020, 2:47 PM

Some optional code layout and comments - but all good either way/whatever you find most readable/helpful.

lld/ELF/InputSection.cpp
908	Probably not a big deal, but curious to do this test when it might not be needed. Ah, partly because fo the if/else if/else structure - if this used early-return instead, perhaps it would be easier to write it like this: if (sym.isTls() && !Out::tlsPhdr) { target->relocateNoSym(bufLoc, type, 0); continue; } if (isDebug && type == target->symbolicRel) { auto *ds = dyn_cast<Defined>(&sym); if (!sym.getOutputSection() \|\| (ds && ds->section->repl != ds->section)) { target->relocateNoSym(bufLoc, type, isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX); continue; } } target->relocateNoSym(bufLoc, type, SignExtend64<bits>(sym.getVA(addend)));
913–917	"normal behavior" might be overstating it - it was the behavior of lld and is the behavior of gold, but isn't the behavior of bfd.ld. maybe "Other options such as zero (1 for debug_ranges) (bfd.ld) or zero+addend (gold) result in overlaps with valid low-address ranges, or deeper problems terminating debug_loc and debug_range lists early"?
920–922	I think you can omit the "DW_AT_low_pc (which may have a non-zero addend)" part of this, perhaps? In general we don't want -1+addend to wrap around into a valid address range, no matter which attribute it might be on.
929–930	Might clarify "a pair of 0 (0+addend could cause this when handling an empty function)" (though perhaps that deosn't need to be called out here - since we already motivated this earlier in the comment talking about low or zero addresses - so only the -1 case needs to be justified/explained why it can't use the same solution as all the other .debug_ sections being handled here) Probably "the entries and -1" rather than a comma there - since there's only two things in this list. (& maybe the end of this "Use -2 instead" might not stand on its own as a sentence - "... so -2 is used."
lld/test/ELF/debug-dead-reloc.s
38	So even if the comdat group is identical, this will resolve only one of them & use -1 for the other? I think that's a regression/separate change that probably should have a sepraate discussion compared to the -1 rather than 0+addend change. Seems there could be some value in resolving all the function descriptions to the same code where possible - but I also see the problem with having CUs ranges overlap (which I think @probinson is especially concerned about). So the change may well be merited, but I think deserves a separate review/discussion. (eg: currently... oh, bfd and gold currently do the "point all the DWARF to the one definition" but it seems lld doesn't do that and, as you say, already does zero (well, zero+addend) out the duplicate references to the same deduplicated comdat) Fair enough then... curious, but not a change in behavior, so guess we slipped that in somewhere along the way & hasn't broken anything that I've heard.

Address comments

lld/ELF/InputSection.cpp
908	Thanks for the suggestion! I thought about the additional test a normal code path would run. It seems that you're also concerned - so I'll refactor the if else if else structure as you suggested to avoid an unneeded dyn_cast.
913–917	I'll just say "Resolving to addend". I may send a patch to GNU ld, and a statement about GNU ld's behavior may be soon outdated.
lld/test/ELF/debug-dead-reloc.s
38	A relocation referencing a local symbol defined in a non-prevailing comdat group currently resolves to 0+addend, instead of prevailing_symbol_address+addend. We change it from 0+addend to -1. This is not a regression. A relocation referencing a global symbol defined in a non-prevailing comdat group resolves prevailing_symbol_address+addend (ELF rule). This patch does not change the behavior. Though, I can't find any global symbol reference in .debug_* Added a test.

Harbormaster completed remote builds in B60923: Diff 271893.Jun 18 2020, 8:11 PM

dblaikie added inline comments.Jun 18 2020, 10:10 PM

lld/ELF/InputSection.cpp
908	yeah, I doubt it's a huge actual perf concern - but shorter scoped variables, etc, make for easier to understand code
lld/test/ELF/debug-dead-reloc.s
38	Yep, just local symbols - you can test the behavior between bfd/gold/lld with this example: a.cpp: inline void f1() { } void f2() { f1(); } b.cpp: inline void f1() { } void f2(); int main() { f1(); f2(); } $ clang++ f1.cpp f2.cpp -g $ llvm-dwarfdump a.out & you'll see that gold and bfd resolve all the relocations to the same comdat to the same address - though this only happens if the comdats are identical. If they differ (say, if you make f1 do some arithmetic and optimize one but don't optimize the other) then the behavior will match lld - keeping one but pointing only the matching DWARF to that definition, and all the other DWARF will point to the tombstone value (whatever that is in each linker). Anyway - nothing new here/related to this patch, and the lld behavior's probably preferable for a lot of folks - means CUs don't start overlapping unintentionally, etc.

MaskRay marked an inline comment as done.Jun 18 2020, 11:25 PM

MaskRay added inline comments.

lld/test/ELF/debug-dead-reloc.s
38	This was previously discussed on https://bugs.llvm.org/show_bug.cgi?id=37212 I agree with you that it deserves its own email thread. See `PRETEND` and `_bfd_elf_check_kept_section` in binutils-gdb/bfd/elflink.c for GNU ld's behavior. GNU ld finds a replacement if the sizes are equal. Anyway - nothing new here/related to this patch, and the lld behavior's probably preferable for a lot of folks - means CUs don't start overlapping unintentionally, etc. I agree. I think a tombstone value is better than `PRETEND` logic.

Latest changes still LGTM, FWIW.

@grimar @probinson Does the latest version look good to you?

grimar added inline comments.Jun 22 2020, 5:01 AM

lld/ELF/InputSection.cpp

857

use const bool for these 2?

912

Missing an empty line before if.

935

It looks a bit unideal to me, because auto *ds = dyn_cast<Defined>(&sym) is always called,
though it is not always needed.
The same happens with isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX condition, which
can be caluclated once.
Also isDebugSection() just do an excessive job it seems, because it checks the ".debug_" prefix.
And sym.getOutputSection() checks that sym is Defined internally, so this check is done twice too.

What about doing the following?

template <class ELFT, class RelTy>
void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
  const unsigned bits = sizeof(typename ELFT::uint) * 8;
  const bool isDebug = isDebugSection(*this);
  const uint64_t Tombstone = (name == ".debug_loc" || name == ".debug_ranges")
                                 ? (UINT64_MAX - 1)
                                 : UINT64_MAX;
  ....

    if (sym.isTls() && !Out::tlsPhdr) {
      target->relocateNoSym(bufLoc, type, 0);
      continue;
    }

    if (!isDebug || type != target->symbolicRel) {
      target->relocateNoSym(bufLoc, type,
                            SignExtend64<bits>(sym.getVA(addend)));
      continue;
    }

    auto *ds = dyn_cast<Defined>(&sym);
    if (!ds || !ds->section->repl || !ds->section->repl->getOutputSection() ||
        ds->section->repl != ds->section)
      target->relocateNoSym(bufLoc, type, Tombstone);
    else
      target->relocateNoSym(bufLoc, type,
                            SignExtend64<bits>(sym.getVA(addend)));
  }
}

grimar added inline comments.Jun 22 2020, 5:08 AM

lld/ELF/InputSection.cpp
935	Also isDebugSection() just do an excessive job it seems, because it checks the ".debug_" prefix. By this I meant that probably no reason to have `isDebug &&` part in the condition: isDebug && (name == ".debug_loc" \|\| name == ".debug_ranges") Though it is not the piece to really worry about.

MaskRay marked 3 inline comments as done.Jun 22 2020, 9:21 AM

MaskRay added inline comments.

lld/ELF/InputSection.cpp
935	`isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX` can easily hoisted by the compiler if it likes. Manually doing this is unnecessary. Though it is not the piece to really worry about. Agree. because auto ds = dyn_cast<Defined>(&sym) is always called, though it is not always needed. The suggested transformation looks a bit unnecessary to me. Most relocations from non-SHF_ALLOC sections are from `.debug_`. The remaining ones are uncommon and/or tiny sections such as __patchable_function_entries, .stack_sizes, .llvm_addrsig, etc. There is very little we can save from the restricting the scope of `dyn_cast`. I want to keep the control flows as is.

Add an empty line. Add const

I've been watching, and have nothing to add to what James or Dave had to say.

Harbormaster failed remote builds in B61255: Diff 272467!Jun 22 2020, 10:45 AM

Closed by commit rGe618ccbf431f: [ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF… (authored by MaskRay). · Explain WhyJun 23 2020, 11:48 AM

This revision was automatically updated to reflect the committed changes.

Whilst merging this change into our downstream port, our private tests flagged up a behaviour difference between our old behaviour and the version this patch implements. On further investigation, I think it is a bug in the upstream patch rather than our merge or tests.

Consider the ICF case with two duplicate functions:

int foo(int x)
{
	return 42 + x;
}

int bar(int x)
{
	return 42 + x;
}

int main(){
	int x = 0;
	x += foo(x);
	x += bar(x);
	return x;
}

In this case, LLD --icf=all will result in foo and bar being combined, with foo being the "kept" function. Prior to this change, references from debug data to the folded function (i.e. bar) were updated to point at the kept function. This would of course mean there was duplicate debug data for various addresses. In general, this may not have been ideal, but in one case I think it is actually necessary, specifically in the case of .debug_line. If you run gdb on the above example, having linked it using LLD, you can set a breakpoint at lines within bar (e.g. line 8 in my example), and it will be hit both when foo is executed and when bar is executed. This is reasonable behaviour, given the debug data doesn't include call site information the debugger can use here. With this patch, the reference in .debug_line to bar is patched to -1. This means that placing a breakpoint at line 8 will result in a breakpoint at address 0x6 (in my local case), rather than in the middle of foo. This in turn means the breakpoint will never be hit. From a user's point of view, if they don't know where their function is being folded into (which would probably be the norm), the breakpoint not working properly is a serious regression. Of course, if it is placed, the program will jump to a different function than expected (and may even break unexpectedly), but I think personally that this is less confusing, especially as that is the old behaviour anyway.

I think we need to add a special case for patching .debug_line, so that duplicates are not patched to -1, but rather to the symbol they are a duplicate of (dead references should still be patched to -1). This would match our downstream's old behaviour.

In D81784#2116925, @jhenderson wrote:
Whilst merging this change into our downstream port, our private tests flagged up a behaviour difference between our old behaviour and the version this patch implements. On further investigation, I think it is a bug in the upstream patch rather than our merge or tests.

Consider the ICF case with two duplicate functions:
int foo(int x)
{
	return 42 + x;
}

int bar(int x)
{
	return 42 + x;
}

int main(){
	int x = 0;
	x += foo(x);
	x += bar(x);
	return x;
}
In this case, LLD --icf=all will result in foo and bar being combined, with foo being the "kept" function. Prior to this change, references from debug data to the folded function (i.e. bar) were updated to point at the kept function. This would of course mean there was duplicate debug data for various addresses. In general, this may not have been ideal, but in one case I think it is actually necessary, specifically in the case of .debug_line. If you run gdb on the above example, having linked it using LLD, you can set a breakpoint at lines within bar (e.g. line 8 in my example), and it will be hit both when foo is executed and when bar is executed. This is reasonable behaviour, given the debug data doesn't include call site information the debugger can use here. With this patch, the reference in .debug_line to bar is patched to -1. This means that placing a breakpoint at line 8 will result in a breakpoint at address 0x6 (in my local case), rather than in the middle of foo. This in turn means the breakpoint will never be hit. From a user's point of view, if they don't know where their function is being folded into (which would probably be the norm), the breakpoint not working properly is a serious regression. Of course, if it is placed, the program will jump to a different function than expected (and may even break unexpectedly), but I think personally that this is less confusing, especially as that is the old behaviour anyway.

I think we need to add a special case for patching .debug_line, so that duplicates are not patched to -1, but rather to the symbol they are a duplicate of (dead references should still be patched to -1). This would match our downstream's old behaviour.

I have mixed feelings about this - I assume any strategy for this would likely also trigger for the case I've seen more often, which is two inline function definitions that are identical and rather than one DW_TAG_subprogram getting to point to the final copy, both DW_TAG_subprograms in different CUs point to the final copy (if that copy is the same length, I think is the test in other linkers - but maybe it's if their contents are identical)?

Is this a change in lld's behavior? Or has lld always resolved identical-but-deduplicated functions to only the one subprogram, or to all subprograms?

Or is your workaround (the one you have in your fork) /only/ for the line table, but not for DW_AT_ranges on the CU or for the low_pc of the subprogram? That would seem pretty quirky - having addresses covered by the line table that aren't covered by the ranges of the CU? But also if it is applied to teh CU too, then you have CUs with overlapping ranges (@probinson I think expressed some concern about overlapping CUs being undesirable previously)

In D81784#2117592, @dblaikie wrote:

I have mixed feelings about this - I assume any strategy for this would likely also trigger for the case I've seen more often, which is two inline function definitions that are identical and rather than one DW_TAG_subprogram getting to point to the final copy, both DW_TAG_subprograms in different CUs point to the final copy (if that copy is the same length, I think is the test in other linkers - but maybe it's if their contents are identical)?

Is this a change in lld's behavior? Or has lld always resolved identical-but-deduplicated functions to only the one subprogram, or to all subprograms?

I haven't specifically looked at the other debug sections. I believe they previously contained duplicate addresses, but am not 100% sure. In most cases this is bad due to ambiguity reasons, I believe. In our downstream port, we patched to -1/-2 (except for .debug_line).

Or is your workaround (the one you have in your fork) /only/ for the line table, but not for DW_AT_ranges on the CU or for the low_pc of the subprogram? That would seem pretty quirky - having addresses covered by the line table that aren't covered by the ranges of the CU? But also if it is applied to teh CU too, then you have CUs with overlapping ranges (@probinson I think expressed some concern about overlapping CUs being undesirable previously)

Our implementation specifically special-cases behaviour for .debug_line to be different to other debug sections, which used -1/-2 for the address references to de-duplicated code, so .debug_ranges for example didn't have overlapping ranges etc. I don't think it's that quirky - the debug line table is actually independent of other debug data and can usefully exist without .debug_info etc; it doesn't know anything about its corresponding CU (apart from some file paths). That all being said, it's been a fairly recent change in our downstream port. Before that, we just did what this patch does in .debug_line as well as other sections.

MaskRay mentioned this in D82828: [ELF] Don't resolve a relocation in .debug_line referencing an ICF folded symbol to the tombstone value.Jun 29 2020, 5:48 PM

Another of our downstream tests has flagged up another issue with this change, this time with TLS symbols. clang appears to use R_X86_64_DTPOFF64 relocations for such symbols, rather than R_X86_64_64 relocations. This means the condition type == target->symbolicRel is not triggering for them. In our downstream port, we didn't check the relocation type, so the references to the discarded TLS variables was always patched to -1. Meanwhile, old LLD before this change appears to be patching it to 0. New LLD is patching it to something relative to the start of the TLS block (in my case it's 0xffffffffffff8000, but I imagine it depends on layout/base address etc). I think we need to relax the relocation type check, possibly even remove it completely.

Whilst trying to create a small fix locally, I noticed that our downstream test was originally motivated by a R_X86_64_DTPOFF32 relocation rather than a 64-bit one, since the DWARF was using a 4-byte location offset in this case. I guess we need to allow both these relocation types to follow the same code path as R_X86_64_64. At that point, I start to wonder whether we should just allow all relocation types to follow that path...

MaskRay mentioned this in D82899: [ELF] Resolve R_DTPREL in .debug_* referencing discarded symbols to -1.Jun 30 2020, 11:38 AM

In D81784#2122721, @jhenderson wrote:

Whilst trying to create a small fix locally, I noticed that our downstream test was originally motivated by a R_X86_64_DTPOFF32 relocation rather than a 64-bit one, since the DWARF was using a 4-byte location offset in this case. I guess we need to allow both these relocation types to follow the same code path as R_X86_64_64. At that point, I start to wonder whether we should just allow all relocation types to follow that path...

I agree that R_X86_64_DTPOFF{32,64} (R_DTPREL) should use -1 as well, but I don't think R_X86_64_32 should use it (tested in debug-dead-reloc-32.s)

Created D82899.

MaskRay mentioned this in rGe6ad78fe0503: [ELF] Don't resolve a relocation in .debug_line referencing an ICF folded….Jul 1 2020, 2:05 PM

MaskRay mentioned this in rG6fa1343bb3e8: [ELF] Resolve R_DTPREL in .debug_* referencing discarded symbols to -1.Jul 3 2020, 10:12 AM

MaskRay mentioned this in D83957: [lldb/DWARF] Don't get confused by line sequences with tombstone values.Jul 16 2020, 5:41 PM

labath mentioned this in rGf3fab392f574: [lldb/DWARF] Don't get confused by line sequences with tombstone values.Jul 17 2020, 2:52 AM

hans mentioned this in rG884357e301a9: [lldb/DWARF] Don't get confused by line sequences with tombstone values.Jul 17 2020, 4:22 AM

MaskRay mentioned this in D84825: [ELF] Change tombstone values to (.debug_ranges/.debug_loc) 1 and (other .debug_*) 0.Aug 6 2020, 3:10 PM

MaskRay mentioned this in rG004be4037e1e: [ELF] Change tombstone values to (.debug_ranges/.debug_loc) 1 and (other ..Aug 6 2020, 3:30 PM

pzheng added a subscriber: pzheng.Aug 11 2020, 11:28 AM

ikudrin mentioned this in D116946: [ELF] Use tombstone values for discarded symbols in relocatable output.Jan 10 2022, 7:31 AM

ikudrin mentioned this in rGe00ac48df3a6: [ELF] Use tombstone values for discarded symbols in relocatable output.Jan 12 2022, 9:32 PM

Revision Contents

Path

Size

lld/

ELF/

InputSection.cpp

38 lines

test/

ELF/

debug-dead-reloc-32.s

29 lines

debug-dead-reloc-icf.s

24 lines

debug-dead-reloc.s

53 lines

Diff 272467

lld/ELF/InputSection.cpp

Show First 20 Lines • Show All 846 Lines • ▼ Show 20 Lines
// an example. Relocations in non-alloc sections are much easier to		// an example. Relocations in non-alloc sections are much easier to
// handle than in allocated sections because it will never need complex		// handle than in allocated sections because it will never need complex
// treatment such as GOT or PLT (because at runtime no one refers them).		// treatment such as GOT or PLT (because at runtime no one refers them).
// So, we handle relocations for non-alloc sections directly in this		// So, we handle relocations for non-alloc sections directly in this
// function as a performance optimization.		// function as a performance optimization.
template <class ELFT, class RelTy>		template <class ELFT, class RelTy>
void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {		void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
const unsigned bits = sizeof(typename ELFT::uint) * 8;		const unsigned bits = sizeof(typename ELFT::uint) * 8;
		const bool isDebug = isDebugSection(*this);
		const bool isDebugLocOrRanges =
		isDebug && (name == ".debug_loc" \|\| name == ".debug_ranges");
		grimarUnsubmitted Done Reply Inline Actions use `const bool` for these 2? grimar: use `const bool` for these 2?

for (const RelTy &rel : rels) {		for (const RelTy &rel : rels) {
RelType type = rel.getType(config->isMips64EL);		RelType type = rel.getType(config->isMips64EL);

// GCC 8.0 or earlier have a bug that they emit R_386_GOTPC relocations		// GCC 8.0 or earlier have a bug that they emit R_386_GOTPC relocations
// against _GLOBAL_OFFSET_TABLE_ for .debug_info. The bug has been fixed		// against _GLOBAL_OFFSET_TABLE_ for .debug_info. The bug has been fixed
// in 2017 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82630), but we		// in 2017 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82630), but we
// need to keep this bug-compatible code for a while.		// need to keep this bug-compatible code for a while.
Show All 34 Lines	if (expr != R_ABS && expr != R_DTPREL && expr != R_RISCV_ADD) {
// address 0. For bug-compatibilty, we accept them with warnings. We		// address 0. For bug-compatibilty, we accept them with warnings. We
// know Steel Bank Common Lisp as of 2018 have this bug.		// know Steel Bank Common Lisp as of 2018 have this bug.
warn(msg);		warn(msg);
target->relocateNoSym(bufLoc, type,		target->relocateNoSym(bufLoc, type,
SignExtend64<bits>(sym.getVA(addend - offset)));		SignExtend64<bits>(sym.getVA(addend - offset)));
continue;		continue;
}		}

if (sym.isTls() && !Out::tlsPhdr)		if (sym.isTls() && !Out::tlsPhdr) {
		dblaikieUnsubmitted Done Reply Inline Actions Probably not a big deal, but curious to do this test when it might not be needed. Ah, partly because fo the if/else if/else structure - if this used early-return instead, perhaps it would be easier to write it like this: if (sym.isTls() && !Out::tlsPhdr) { target->relocateNoSym(bufLoc, type, 0); continue; } if (isDebug && type == target->symbolicRel) { auto ds = dyn_cast<Defined>(&sym); if (!sym.getOutputSection() \|\| (ds && ds->section->repl != ds->section)) { target->relocateNoSym(bufLoc, type, isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX); continue; } } target->relocateNoSym(bufLoc, type, SignExtend64<bits>(sym.getVA(addend))); dblaikie:* Probably not a big deal, but curious to do this test when it might not be needed. Ah, partly…
		MaskRayAuthorUnsubmitted Done Reply Inline Actions Thanks for the suggestion! I thought about the additional test a normal code path would run. It seems that you're also concerned - so I'll refactor the if else if else structure as you suggested to avoid an unneeded dyn_cast. MaskRay: Thanks for the suggestion! I thought about the additional test a normal code path would run. It…
		dblaikieUnsubmitted Not Done Reply Inline Actions yeah, I doubt it's a huge actual perf concern - but shorter scoped variables, etc, make for easier to understand code dblaikie: yeah, I doubt it's a huge actual perf concern - but shorter scoped variables, etc, make for…
target->relocateNoSym(bufLoc, type, 0);		target->relocateNoSym(bufLoc, type, 0);
else		continue;
		}

		grimarUnsubmitted Done Reply Inline Actions Missing an empty line before `if`. grimar: Missing an empty line before `if`.
		if (isDebug && type == target->symbolicRel) {
		// Resolve relocations in .debug_* referencing (discarded symbols or ICF
		// folded section symbols) to a tombstone value. Resolving to addend is
		// unsatisfactory because the result address range may collide with a
		// valid range of low address, or leave multiple CUs claiming ownership of
		dblaikieUnsubmitted Done Reply Inline Actions "normal behavior" might be overstating it - it was the behavior of lld and is the behavior of gold, but isn't the behavior of bfd.ld. maybe "Other options such as zero (1 for debug_ranges) (bfd.ld) or zero+addend (gold) result in overlaps with valid low-address ranges, or deeper problems terminating debug_loc and debug_range lists early"? dblaikie: "normal behavior" might be overstating it - it was the behavior of lld and is the behavior of…
		MaskRayAuthorUnsubmitted Done Reply Inline Actions I'll just say "Resolving to addend". I may send a patch to GNU ld, and a statement about GNU ld's behavior may be soon outdated. MaskRay: I'll just say "Resolving to addend". I may send a patch to GNU ld, and a statement about GNU…
		// the same range of code, which may confuse consumers.
		//
		// To address the problems, we use -1 as a tombstone value for most
		// .debug_* sections. We have to ignore the addend because we don't want
		// to resolve an address attribute (which may have a non-zero addend) to
		dblaikieUnsubmitted Done Reply Inline Actions I think you can omit the "DW_AT_low_pc (which may have a non-zero addend)" part of this, perhaps? In general we don't want -1+addend to wrap around into a valid address range, no matter which attribute it might be on. dblaikie: I think you can omit the "DW_AT_low_pc (which may have a non-zero addend)" part of this…
		// -1+addend (wrap around to a low address).
		//
		// If the referenced symbol is discarded (made Undefined), or the
		// section defining the referenced symbol is garbage collected,
		// sym.getOutputSection() is nullptr. `ds->section->repl != ds->section`
		// catches the ICF folded case.
		//
		// For pre-DWARF-v5 .debug_loc and .debug_ranges, -1 is a reserved value
		jhendersonUnsubmitted Done Reply Inline Actions Nit: spurious space before the full stop. jhenderson: Nit: spurious space before the full stop.
		MaskRayAuthorUnsubmitted Done Reply Inline Actions I tend to not write `.debug_ranges.` I will add `sections` MaskRay: I tend to not write `.debug_ranges.` I will add `sections`
		dblaikieUnsubmitted Done Reply Inline Actions Might clarify "a pair of 0 (0+addend could cause this when handling an empty function)" (though perhaps that deosn't need to be called out here - since we already motivated this earlier in the comment talking about low or zero addresses - so only the -1 case needs to be justified/explained why it can't use the same solution as all the other .debug_ sections being handled here) Probably "the entries and -1" rather than a comma there - since there's only two things in this list. (& maybe the end of this "Use -2 instead" might not stand on its own as a sentence - "... so -2 is used." dblaikie: Might clarify "a pair of 0 (0+addend could cause this when handling an empty function)" (though…
		// (base address selection entry), so -2 is used.
		auto *ds = dyn_cast<Defined>(&sym);
		avlUnsubmitted Done Reply Inline Actions I think it is better not to do string comparisons for every relocation pointing to deleted code. It is worth to move this name comparisons before the loop. avl: I think it is better not to do string comparisons for every relocation pointing to deleted code.
		MaskRayAuthorUnsubmitted Done Reply Inline Actions I think it is actually more efficient doing string comparisons here. The tombstone value code path is rare. By doing string comparisons here, we don't perform string comparisons in the normal code paths. MaskRay: I think it is actually more efficient doing string comparisons here. The tombstone value code…
		dblaikieUnsubmitted Done Reply Inline Actions But you have to do it repeatedly for every relocation in this section, yes? - and .debug_loc and .debug_ranges may contain many relocations. The test for "is this a debug section" happens unconditionally once, outside the relocation application loop - it seems like this test could/should happen in the same place (& it can be conditional on "is this a debug section": ` bool isDebugLocOrRanges = isDebugLoc && name == ".debug_loc" \|\| name == ".debug_ranges"; ` Seems like it'd be cheap enough (known short length, the string's already hot in the cache, etc). (alternatively the test could be computed lazily and cached - but given "isDebug" isn't done that way I don't think this one should be either) dblaikie: But you have to do it repeatedly for every relocation in this section, yes? - and .debug_loc…
		if (!sym.getOutputSection() \|\| (ds && ds->section->repl != ds->section)) {
target->relocateNoSym(bufLoc, type,		target->relocateNoSym(bufLoc, type,
SignExtend64<bits>(sym.getVA(addend)));		isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX);
		grimarUnsubmitted Not Done Reply Inline Actions It looks a bit unideal to me, because `auto ds = dyn_cast<Defined>(&sym)` is always called, though it is not always needed. The same happens with `isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX` condition, which can be caluclated once. Also `isDebugSection()` just do an excessive job it seems, because it checks the ".debug_" prefix. And `sym.getOutputSection()` checks that `sym` is Defined internally, so this check is done twice too. What about doing the following? template <class ELFT, class RelTy> void InputSection::relocateNonAlloc(uint8_t buf, ArrayRef<RelTy> rels) { const unsigned bits = sizeof(typename ELFT::uint) * 8; const bool isDebug = isDebugSection(this); const uint64_t Tombstone = (name == ".debug_loc" \|\| name == ".debug_ranges") ? (UINT64_MAX - 1) : UINT64_MAX; .... if (sym.isTls() && !Out::tlsPhdr) { target->relocateNoSym(bufLoc, type, 0); continue; } if (!isDebug \|\| type != target->symbolicRel) { target->relocateNoSym(bufLoc, type, SignExtend64<bits>(sym.getVA(addend))); continue; } auto ds = dyn_cast<Defined>(&sym); if (!ds \|\| !ds->section->repl \|\| !ds->section->repl->getOutputSection() \|\| ds->section->repl != ds->section) target->relocateNoSym(bufLoc, type, Tombstone); else target->relocateNoSym(bufLoc, type, SignExtend64<bits>(sym.getVA(addend))); } } grimar: It looks a bit unideal to me, because `auto *ds = dyn_cast<Defined>(&sym)` is always called…
		grimarUnsubmitted Not Done Reply Inline Actions Also isDebugSection() just do an excessive job it seems, because it checks the ".debug_" prefix. By this I meant that probably no reason to have `isDebug &&` part in the condition: isDebug && (name == ".debug_loc" \|\| name == ".debug_ranges") Though it is not the piece to really worry about. grimar: > Also isDebugSection() just do an excessive job it seems, because it checks the ".debug_"…
		MaskRayAuthorUnsubmitted Done Reply Inline Actions `isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX` can easily hoisted by the compiler if it likes. Manually doing this is unnecessary. Though it is not the piece to really worry about. Agree. because auto ds = dyn_cast<Defined>(&sym) is always called, though it is not always needed. The suggested transformation looks a bit unnecessary to me. Most relocations from non-SHF_ALLOC sections are from `.debug_`. The remaining ones are uncommon and/or tiny sections such as __patchable_function_entries, .stack_sizes, .llvm_addrsig, etc. There is very little we can save from the restricting the scope of `dyn_cast`. I want to keep the control flows as is. MaskRay: `isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX` can easily hoisted by the compiler if it…
		continue;
		}
		}
		target->relocateNoSym(bufLoc, type, SignExtend64<bits>(sym.getVA(addend)));
}		}
}		}

// This is used when '-r' is given.		// This is used when '-r' is given.
// For REL targets, InputSection::copyRelocations() may store artificial		// For REL targets, InputSection::copyRelocations() may store artificial
// relocations aimed to update addends. They are handled in relocateAlloc()		// relocations aimed to update addends. They are handled in relocateAlloc()
// for allocatable sections, and this function does the same for		// for allocatable sections, and this function does the same for
// non-allocatable sections, such as sections with debug information.		// non-allocatable sections, such as sections with debug information.
▲ Show 20 Lines • Show All 485 Lines • Show Last 20 Lines

lld/test/ELF/debug-dead-reloc-32.s

This file was added.

				# REQUIRES: x86
				## Test we resolve symbolic relocations in .debug_* sections to a tombstone
				## value if the referenced symbol is discarded (--gc-sections, non-prevailing
				jhendersonUnsubmitted Done Reply Inline Actions tombstone -> tombstone value Here and in other tests. jhenderson: tombstone -> tombstone value Here and in other tests.
				## section group, SHF_EXCLUDE, /DISCARD/, etc).

				# RUN: llvm-mc -filetype=obj -triple=i386 %s -o %t.o
				# RUN: ld.lld %t.o -o %t
				# RUN: llvm-objdump -s %t \| FileCheck %s

				# CHECK: Contents of section .debug_loc:
				# CHECK-NEXT: 0000 feffffff
				# CHECK-NEXT: Contents of section .debug_ranges:
				# CHECK-NEXT: 0000 feffffff
				# CHECK-NEXT: Contents of section .debug_addr:
				# CHECK-NEXT: 0000 ffffffff

				.section .text.1,"axe"
				.byte 0
				jhendersonUnsubmitted Done Reply Inline Actions Is the 'e' for SHF_EXCLUDE? If so, do we actually want it and --gc-sections? Seems like one or the other would be sufficient to me. jhenderson: Is the 'e' for SHF_EXCLUDE? If so, do we actually want it and --gc-sections? Seems like one or…
				MaskRayAuthorUnsubmitted Done Reply Inline Actions I'll drop --gc-sections. MaskRay: I'll drop --gc-sections.

				## Resolved to UINT32_C(-2), with the addend ignored.
				## UINT32_C(-1) is a reserved value (base address selection entry) which can't be used.
				.section .debug_loc
				.long .text.1+8
				.section .debug_ranges
				.long .text.1+16

				## Resolved to UINT32_C(-1), with the addend ignored.
				.section .debug_addr
				.long .text.1+8

lld/test/ELF/debug-dead-reloc-icf.s

This file was added.

				# REQUIRES: x86
				## Test we resolve symbolic relocations in .debug_* sections to a tombstone
				## value if the referenced section symbol is folded into another section by ICF.
				## Otherwise, we would leave entries in multiple CUs claiming ownership of the
				## same range of code, which can confuse consumers.

				jhendersonUnsubmitted Done Reply Inline Actions One issue that I don't recall whether it was brought up previously: if we resolve one to -1, and the other has no debug data (e.g. they came from two different CUs, only one of which was compiled with -g), we could actually lose debug data for the function completely. I don't know what the right thing to do is in this situation, as I don't know whether it's better than the risk of confusing consumers. I don't think we can reasonably attempt to identify whether there is matching debug data for the kept version after all. jhenderson: One issue that I don't recall whether it was brought up previously: if we resolve one to -1…
				dblaikieUnsubmitted Done Reply Inline Actions Yeah, I'd accept that as an existing problem/not one that I think this tombstone work should necessarily address - the old solution or the new solution would be compatible with a linker making a more nuanced approach to this problem. (but, in that separate discussion: I'm not sure this would be appropriate to fix - fixing this, I think, would mean having debug info influence which version of a function the linker chooses - increasing the risk of heisenbugs) dblaikie: Yeah, I'd accept that as an existing problem/not one that I think this tombstone work should…
				MaskRayAuthorUnsubmitted Done Reply Inline Actions https://reviews.llvm.org/D59553#2083857 My understanding is that: the prevailing definition should have associated debug info. If not, the linker should not do extra work to attach debug info from other copies to the prevailing definition. MaskRay: https://reviews.llvm.org/D59553#2083857 My understanding is that: the prevailing definition…
				jhendersonUnsubmitted Not Done Reply Inline Actions FWIW, I looked at our properietary linker and downstream LLD ports, both of which have the -1 patching behaviour implemented, and we don't handle the "some with, some without" case at all, so ICF folded symbols can theoretically lose debug data, same as is implemented in this patch. So from our point of view, there's no regression, just a continued space from improvement. jhenderson: FWIW, I looked at our properietary linker and downstream LLD ports, both of which have the -1…
				# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
				# RUN: ld.lld --icf=all %t.o -o %t
				# RUN: llvm-objdump -s %t \| FileCheck %s

				# CHECK: Contents of section .debug_info:
				# CHECK-NEXT: 0000 {{[0-9a-f]+}}000 00000000 ffffffff ffffffff

				jhendersonUnsubmitted Done Reply Inline Actions Perhaps it would be better to spell out all the zeros for the first relocation target, as it was a bit confusing seeing one set of 8 followed by a set of 16. jhenderson: Perhaps it would be better to spell out all the zeros for the first relocation target, as it…
				.globl _start
				_start:
				ret

				## .text.1 will be folded by ICF.
				.section .text.1,"ax"
				ret

				.section .debug_info
				.quad .text+8
				.quad .text.1+8

lld/test/ELF/debug-dead-reloc.s

This file was added.

				# REQUIRES: x86
				## Test we resolve symbolic relocations in .debug_* sections to a tombstone
				## value if the referenced symbol is discarded (--gc-sections, non-prevailing
				## section group, SHF_EXCLUDE, /DISCARD/, etc).

				# RUN: echo '.globl _start; _start: call group' \| llvm-mc -filetype=obj -triple=x86_64 - -o %t.o
				# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
				# RUN: ld.lld --gc-sections %t.o %t1.o %t1.o -o %t
				# RUN: llvm-objdump -s %t \| FileCheck %s

				# CHECK: Contents of section .debug_loc:
				# CHECK-NEXT: 0000 feffffff ffffffff feffffff ffffffff
				# CHECK-NEXT: Contents of section .debug_ranges:
				# CHECK-NEXT: 0000 feffffff ffffffff feffffff ffffffff
				# CHECK-NEXT: Contents of section .debug_addr:
				# CHECK-NEXT: 0000 {{.}}000 00000000 {{.}}000 00000000
				# CHECK-NEXT: 0010 ffffffff ffffffff {{.*}}000 00000000
				# CHECK-NEXT: Contents of section .debug_foo:
				# CHECK-NEXT: 0000 ffffffff ffffffff 08000000 00000000
				# CHECK-NEXT: 0010 ffffffff ffffffff 08000000 00000000

				.section .text.1,"ax"
				.byte 0
				.section .text.2,"axe"
				.byte 0
				.section .text.3,"axG",@progbits,group,comdat
				.globl group
				group:
				.byte 0

				## Resolved to UINT64_C(-2), with the addend ignored.
				## UINT64_C(-1) is a reserved value (base address selection entry) which can't be used.
				.section .debug_loc
				.quad .text.1+8
				.section .debug_ranges
				.quad .text.2+16

				.section .debug_addr
				dblaikieUnsubmitted Not Done Reply Inline Actions So even if the comdat group is identical, this will resolve only one of them & use -1 for the other? I think that's a regression/separate change that probably should have a sepraate discussion compared to the -1 rather than 0+addend change. Seems there could be some value in resolving all the function descriptions to the same code where possible - but I also see the problem with having CUs ranges overlap (which I think @probinson is especially concerned about). So the change may well be merited, but I think deserves a separate review/discussion. (eg: currently... oh, bfd and gold currently do the "point all the DWARF to the one definition" but it seems lld doesn't do that and, as you say, already does zero (well, zero+addend) out the duplicate references to the same deduplicated comdat) Fair enough then... curious, but not a change in behavior, so guess we slipped that in somewhere along the way & hasn't broken anything that I've heard. dblaikie: So even if the comdat group is identical, this will resolve only one of them & use -1 for the…
				MaskRayAuthorUnsubmitted Done Reply Inline Actions A relocation referencing a local symbol defined in a non-prevailing comdat group currently resolves to 0+addend, instead of prevailing_symbol_address+addend. We change it from 0+addend to -1. This is not a regression. A relocation referencing a global symbol defined in a non-prevailing comdat group resolves prevailing_symbol_address+addend (ELF rule). This patch does not change the behavior. Though, I can't find any global symbol reference in .debug_* Added a test. MaskRay: A relocation referencing a local symbol defined in a non-prevailing comdat group currently…
				dblaikieUnsubmitted Not Done Reply Inline Actions Yep, just local symbols - you can test the behavior between bfd/gold/lld with this example: a.cpp: inline void f1() { } void f2() { f1(); } b.cpp: inline void f1() { } void f2(); int main() { f1(); f2(); } $ clang++ f1.cpp f2.cpp -g $ llvm-dwarfdump a.out & you'll see that gold and bfd resolve all the relocations to the same comdat to the same address - though this only happens if the comdats are identical. If they differ (say, if you make f1 do some arithmetic and optimize one but don't optimize the other) then the behavior will match lld - keeping one but pointing only the matching DWARF to that definition, and all the other DWARF will point to the tombstone value (whatever that is in each linker). Anyway - nothing new here/related to this patch, and the lld behavior's probably preferable for a lot of folks - means CUs don't start overlapping unintentionally, etc. dblaikie: Yep, just local symbols - you can test the behavior between bfd/gold/lld with this example: a.
				MaskRayAuthorUnsubmitted Done Reply Inline Actions This was previously discussed on https://bugs.llvm.org/show_bug.cgi?id=37212 I agree with you that it deserves its own email thread. See `PRETEND` and `_bfd_elf_check_kept_section` in binutils-gdb/bfd/elflink.c for GNU ld's behavior. GNU ld finds a replacement if the sizes are equal. Anyway - nothing new here/related to this patch, and the lld behavior's probably preferable for a lot of folks - means CUs don't start overlapping unintentionally, etc. I agree. I think a tombstone value is better than `PRETEND` logic. MaskRay: This was previously discussed on https://bugs.llvm.org/show_bug.cgi?id=37212 I agree with you…
				## .text.3 is a local symbol. The symbol defined in a non-prevailing group is
				## discarded. Resolved to UINT64_C(-1).
				.quad .text.3+24
				## group is a non-local symbol. The relocation from the second %t1.o gets
				## resolved to the prevailing copy.
				.quad group+32

				.section .debug_foo
				.quad .text.1+8

				## We only deal with DW_FORM_addr. Don't special case short-range absolute
				## relocations. Treat them like regular absolute relocations referencing
				## discarded symbols, which are resolved to the addend.
				.long .text.1+8
				.long 0

This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF folded section symbols) to tombstone valuesClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 272467

lld/ELF/InputSection.cpp

lld/test/ELF/debug-dead-reloc-32.s

lld/test/ELF/debug-dead-reloc-icf.s

lld/test/ELF/debug-dead-reloc.s

[ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF folded section symbols) to tombstone values
ClosedPublic