Page MenuHomePhabricator

[ELF] Don't resolve a relocation in .debug_line referencing an ICF folded symbol to the tombstone value
ClosedPublic

Authored by MaskRay on Mon, Jun 29, 5:48 PM.

Details

Summary

After D81784, we resolve a relocation in .debug_* referencing an ICF folded
section symbol to a tombstone value.

Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 ):
.debug_line may describe folded lines as having addresses UINT64_MAX or
some wraparound small addresses.

int foo(int x) {
  return x; // line 2
}

int bar(int x) {
  return x; // line 6
}
Address            Line   Column File   ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x00000000002016c0      1      0      1   0             0  is_stmt
0x00000000002016c7      2      9      1   0             0  is_stmt
prologue_end
0x00000000002016ca      2      2      1   0             0
0x00000000002016cc      2      2      1   0             0  end_sequence
// UINT64_MAX and wraparound small addresses
0xffffffffffffffff      5      0      1   0             0  is_stmt
0x0000000000000006      6      9      1   0             0  is_stmt
prologue_end
0x0000000000000009      6      2      1   0             0
0x000000000000000b      6      2      1   0             0  end_sequence
0x00000000002016d0      9      0      1   0             0  is_stmt
0x00000000002016df     10      6      1   0             0  is_stmt prologue_end
0x00000000002016e6     11     11      1   0             0  is_stmt
...

These entries can confuse debuggers:

gdb before 2020-07-01 (binutils-gdb a8caed5d7faa639a1e6769eba551d15d8ddd9510 )
(can't continue due to a breakpoint in an invalid region of memory):

Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x6

lldb (breakpoint has no effect):

(lldb) b 6
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.

This patch special cases .debug_line to not use the tombstone value,
restoring the previous behavior: .debug_line will have entries with the
same addresses (ICF) but different line numbers. A breakpoint on line 2
or 6 will trigger on both functions.

Diff Detail

Event Timeline

MaskRay created this revision.Mon, Jun 29, 5:48 PM

Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 )

I think that describes a different problem from the one you're trying to address here - that one is more (from my understanding) a quality-of-life thing. Where it might be nicer if you could set breakpoints on ICF eliminated functions, right?

.debug_line may describe folded lines as having addresses UINT64_MAX or some wraparound small addresses.

^ that's kind of the goal of the feature, though. So that seems like "working as intended" (& describes the behavior of this feature in other sections too - and for gc'd (rather than ICF'd) code in debug_line still, right?)

These entries can confuse debuggers

Any idea why 0 entries don't confuse debuggers in the same way? Or did lld not use zero previously for ICF functions? (instead did it point all relocations to the ICF to the one copy - rather than making one real and the rest zero (well, zero+addend)?)

(I'm sort of on the fence about this - seems an unfortunate extra special case (especially applying it only to .debug_line - while the .debug_info, etc, still don't see these duplicates - which isn't necessarily better because overlapping debug_ranges might confuse some consumers, etc... ) where we've already got a few)

I think this is the right fix, but slightly for the wrong reasons. You're describing the problem as the confusing addresses for debuggers, whereas I'm saying it's the inability to set a breakpoint on a folded function's line that is the problem. The two are related but not quite the same. In the "confusing addresses" case, one way to fix it would be to update debuggers to recognise the tombstones and special-case them. That would be fine, except that it doesn't allow a user to set a breakpoint on lives that were folded in (i.e. my point). Hence, it's important to characterise this as "I want to be able to set breakpoints on folded-in functions".

I've suggested comment changes inline to reflect this.

By the way, what about discarded COMDATs? They're similar in concept, but implemented differently, at least at the front-end, so might deserve their own test case?

lld/ELF/InputSection.cpp
930–933

I recommend:

"However, resolving a relocation in .debug_line to -1 would stop debugger users from setting breakpoints on the folded-in function, so exclude .debug_line."

lld/test/ELF/debug-dead-reloc-icf.s
29–32

# -> ##

Similar to the code comment, I recommend:

".debug_line contributions associated with folded functions will describe different lines to the canonical funcion. Leaving a tombstone value would prevent users setting breakpoints in the folded-in function. Instead resolve the relocation to the folded .text.1 to .text."

MaskRay marked 2 inline comments as done.Tue, Jun 30, 8:55 AM

Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 )

I think that describes a different problem from the one you're trying to address here - that one is more (from my understanding) a quality-of-life thing. Where it might be nicer if you could set breakpoints on ICF eliminated functions, right?

.debug_line may describe folded lines as having addresses UINT64_MAX or some wraparound small addresses.

^ that's kind of the goal of the feature, though. So that seems like "working as intended" (& describes the behavior of this feature in other sections too - and for gc'd (rather than ICF'd) code in debug_line still, right?)

As @jhenderson commented, this is more about debugging experience, not a "confused debugger" problem.

These entries can confuse debuggers

Any idea why 0 entries don't confuse debuggers in the same way? Or did lld not use zero previously for ICF functions? (instead did it point all relocations to the ICF to the one copy - rather than making one real and the rest zero (well, zero+addend)?)

Previously, a relocation referencing a folded section symbol (e.g. .text.1) resolves to .text, not 0. So .debug_line has duplicate address entries but for different lines.
This patch restores the behavior.

(I'm sort of on the fence about this - seems an unfortunate extra special case (especially applying it only to .debug_line - while the .debug_info, etc, still don't see these duplicates - which isn't necessarily better because overlapping debug_ranges might confuse some consumers, etc... ) where we've already got a few)

MaskRay updated this revision to Diff 274508.Tue, Jun 30, 8:56 AM

Address comments

MaskRay added a comment.EditedTue, Jun 30, 9:00 AM

I think this is the right fix, but slightly for the wrong reasons. You're describing the problem as the confusing addresses for debuggers, whereas I'm saying it's the inability to set a breakpoint on a folded function's line that is the problem. The two are related but not quite the same. In the "confusing addresses" case, one way to fix it would be to update debuggers to recognise the tombstones and special-case them. That would be fine, except that it doesn't allow a user to set a breakpoint on lives that were folded in (i.e. my point). Hence, it's important to characterise this as "I want to be able to set breakpoints on folded-in functions".

I've suggested comment changes inline to reflect this.

Thanks. Adopted.

By the way, what about discarded COMDATs? They're similar in concept, but implemented differently, at least at the front-end, so might deserve their own test case?

Dead relocations in .debug_line due to non-prevailing COMDATs is not special cased like ICF. COMDATs usually have the same file:line (in a common .h) There is no loss of debugging experience.

In few cases two translation units may define a inline function in different places. In that case, we never have any GNU ld PRETEND logic (https://sourceware.org/pipermail/binutils/2020-June/111784.html ). Setting a breakpoint on a non-prevailing function never works. This patch doesn't change anything. This should already been clear in debug-dead-reloc-32.s, so I don't think an additional test for .debug_line is necessary.

jhenderson accepted this revision.Wed, Jul 1, 12:45 AM

Two nits, otherwise LGTM, but might be worth getting others to confirm they're happy with the behaviour too.

lld/ELF/InputSection.cpp
932

Nit: So -> so

lld/test/ELF/debug-dead-reloc-icf.s
32

Nit: missing trailing full stop.

This revision is now accepted and ready to land.Wed, Jul 1, 12:45 AM

I agree that being able to set a breakpoint on each different source of an ICF is useful, and that the COMDAT case typically has the same source and can use the tombstone.

MaskRay marked 2 inline comments as done.Wed, Jul 1, 11:05 AM
MaskRay added inline comments.
lld/test/ELF/debug-dead-reloc-icf.s
32

I usually tend to omit the trailing period if it would become part of a section name as it causes some confusion (for example .text.hot. can be produced by clang PGO after D79600)...

MaskRay updated this revision to Diff 274877.Wed, Jul 1, 11:05 AM

Fix a comment

@dblaikie Dave, are you ok with this change as well?

dblaikie accepted this revision.Wed, Jul 1, 11:50 AM

@dblaikie Dave, are you ok with this change as well?

Sure! Not the biggest fan, personally, of these layered special cases - but equally if Sony's been shipping with it/demonstrates practical usefulness, etc, I won't stand in the way & can appreciate the practical benefits.

MaskRay edited the summary of this revision. (Show Details)Wed, Jul 1, 1:29 PM

@dblaikie Dave, are you ok with this change as well?

Sure! Not the biggest fan, personally, of these layered special cases - but equally if Sony's been shipping with it/demonstrates practical usefulness, etc, I won't stand in the way & can appreciate the practical benefits.

For COMDAT, there is a prevailing (canonical) group. Usually the file:line is the same. For ICF, every function should be equal. If a function can't be set breakpoint just because the linker does not select it as the canonical one, the lost debugging experience looks genuine to me. Just committed https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=a8caed5d7faa639a1e6769eba551d15d8ddd9510 (woohoo! my first gdb patch) before it, the experience was even less pleasant because gdb would keep warning when you hit continue

Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x6
This revision was automatically updated to reflect the committed changes.

Hello, this seems to increase runtime of some dwarf processing tools for us by several orders of magnitude (from "terminate in a few minutes" to "don't know how long they terminate; not before our timeouts"). https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c5 has repro steps.

Can we revert this and analyze async?

Hello, this seems to increase runtime of some dwarf processing tools for us by several orders of magnitude (from "terminate in a few minutes" to "don't know how long they terminate; not before our timeouts"). https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c5 has repro steps.

Can we revert this and analyze async?

I don't think this is LLD's problem. I'd rather add an option --dead-reloc-addend='.debug_*=0xffffffffffffffff', probably temporarily.