This is an archive of the discontinued LLVM Phabricator instance.

[llvm] Skip over empty line table entries.
ClosedPublic

Authored by mtrofin on Mar 4 2019, 10:07 PM.

Event Timeline

mtrofin created this revision.Mar 4 2019, 10:07 PM
mtrofin updated this revision to Diff 189278.Mar 4 2019, 10:37 PM
  • Fix search.

Might be worth a separate/targeted test with some of the interesting edge cases? (like empty range at the end of the sequence, etc)

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
875–876

I'd probably flip these conditions - even though there is the last row (so RowPos + 1 is valid here), I think it'd read a bit better, maybe?

Also - what happens if the empty range is at the end (eg: LastRowIndex address == second-to-LastRowIndex address)? Maybe that's already handled by the "containsPC" test above?

@JDevlieghere since this is in general code: Could this affect dsymutil at all?

@JDevlieghere since this is in general code: Could this affect dsymutil at all?

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

Doesn't the DWARF spec require all PC values in the line table to be strictly monotonically increasing? How is it possible to have more than one entry at the same address?

probinson added inline comments.Mar 6 2019, 1:59 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

The DWARF spec's model is based on increasing PC values, however the encoding allows advancing PC by 0. So it's entirely possible to have a line table that (for example) advances line by 2 and PC by 0.

dblaikie added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

I couldn't find specific wording about that - but please do take a look, quite possible I missed it.

LLVM's been generating line tables like this for a while (the test updates show evidence of that) & it might be justifiable to fix this code even if it's technically invalid DWARF, given how long we've been generating it. (to my mind, it is wasted DWARF - it doesn't describe any location, so I think it should be correct to remove it if we wanted to, but not a high priority bug (@echristo reckons there might be consumers using these empty ranges to find the start of a function or the like))

probinson added inline comments.Mar 6 2019, 2:27 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

The non-normative text on the first couple pages of section 6.2 describe the line table as notionally "a large matrix, with one row for each instruction" (p.149 of DWARF 5). However, the normative text describes how special opcodes are formed (section 6.2.5.1, pp. 160-162) without requiring "operation advance" to be nonzero. This permits a special opcode to increment the line number without changing the (address, op_index) tuple.
It's skating on thin ice because DWARF doesn't explicitly give a meaning to this, but the encoding allows it.

aprantl added inline comments.Mar 6 2019, 2:36 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

That's quite interesting for a completely unrelated reason: If the linetable encoding actually supports more than one entry per PC, we could change the implementation of getMergedLocation to emit both locations and then teach the debugger to disambiguate the locations based on DW_AT_call_* information or something.

it might be justifiable to fix this code even if it's technically invalid DWARF, given how long we've been generating it.

Agreed.

dblaikie added inline comments.Mar 6 2019, 2:42 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

I don't know/think I'd say "the linetable encoding actually supports more than one entry per PC"

I think each pair of addresses in the line table is the half-open range [X, Y), so if you have two lines after each other you have [X, X) - I would argue that is the empty range (it certainly is with C++ iterators, for instance), not X. So it doesn't describe any addresses at all.

probinson added inline comments.Mar 6 2019, 2:52 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

Not so fast... We don't encode the line table until deep in the bowels of the assembler. getMergedLocation is working out what to attach to an IR instruction, which doesn't know how to attach more than one location. Teach it that first, and getMergedLocation can return you a list.
It will confuse the heck out of consumers that aren't expecting it, and this will be especially weird for profilers.
I think this particular dark corner of the spec might be exploitable, but it won't be smooth going.

As far as the current patch is concerned, I am uncertain. Having addr2line do something consistent seems likely to do more good than harm, but I can imagine this could cause some confusion in other clients.

probinson added inline comments.Mar 6 2019, 2:59 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

Treating the zero-increment case as an empty range has some appeal. However, it can be confusing. You tell the debugger to plant a breakpoint on foo(), which the debugger will tell you is on line 12; and yet when you hit said breakpoint, you're told that address is on line 15, because things got folded and optimized and the instruction range for lines 12-14 has gone to nothing.
Maybe we hand-wave that away as "what do you expect from optimized code" but it's still not a great debugging experience.

dblaikie added inline comments.Mar 6 2019, 3:11 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
871

Treating the zero-increment case as an empty range has some appeal. However, it can be confusing. You tell the debugger to plant a breakpoint on foo(), which the debugger will tell you is on line 12; and yet when you hit said breakpoint, you're told that address is on line 15, because things got folded and optimized and the instruction range for lines 12-14 has gone to nothing.
Maybe we hand-wave that away as "what do you expect from optimized code" but it's still not a great debugging experience.

I'm not sure I follow this - the line table maps instructions to source locations. If there are no instructions there's nothing to map - I'm pretty sure LLVM's weird zero-length line table range is because of this (set the location, emit no instructions, set the location, emit insntructions - so you end up with the two entries and no locations between them). So DWARF's representation here is pretty clear - if nothing in your function causes instructions until 10 lines in - that's where you first break when you enter the function.

If you say "break foo" - the debugger can tell you, "foo is declared at line 3 (from the DW_TAG_subprogram) and the first instruction is on line 10, which is where you'll break".

Pretty sure GCC doesn't do anything like this, and gas... hmm, nope, actually GCC and gas both do do this thing, producing two line entries at the start of functions that have nothing in the prologue.

All the more reason to support parsing them and interpreting them the same way addr2line does - even if we did decide to change Clang's behavior not to generate these.

I'm not sure I follow this - the line table maps instructions to source locations. If there are no instructions there's nothing to map

That's the intent. However, in practice it maps instruction *addresses* to source locations. Let's say you have a table that decodes to

0x0100  line 12
0x0100  line 14
0x0104  line 15

Go through looking for address 0x0100, you get an exact match that returns line 12 (why bother looking ahead to see whether the interval is empty?).
But look for address 0x0101 and you get line 14.

I'm arguing myself into agreeing with you that the extra work to look for a non-empty interval is probably a good thing; however this is something that each consumer will have to understand to do for itself.

I'm not sure I follow this - the line table maps instructions to source locations. If there are no instructions there's nothing to map

That's the intent. However, in practice it maps instruction *addresses* to source locations.

Well, in practice it's unspecified what it means - so we can look at this and come up with different interpretations. Mine is to interpret these as half open intervals - in which case [100, 100) is empty and you keep searching, then you find [100, 104) and that contains the address you're looking for.

Let's say you have a table that decodes to

0x0100  line 12
0x0100  line 14
0x0104  line 15

Go through looking for address 0x0100, you get an exact match that returns line 12 (why bother looking ahead to see whether the interval is empty?).
But look for address 0x0101 and you get line 14.

I'm arguing myself into agreeing with you that the extra work to look for a non-empty interval is probably a good thing; however this is something that each consumer will have to understand to do for itself.

@aprantl This doesn't seem to affect dsymutil. I compared clang output with and without this patch just to be sure.

So just for the record, I'm fine with making this kind of change in case this got lost in the noise :-)

I'm not sure I follow this - the line table maps instructions to source locations. If there are no instructions there's nothing to map

That's the intent. However, in practice it maps instruction *addresses* to source locations.

Well, in practice it's unspecified what it means - so we can look at this and come up with different interpretations. Mine is to interpret these as half open intervals - in which case [100, 100) is empty and you keep searching, then you find [100, 104) and that contains the address you're looking for.

Understood, but if you're walking through the line-number program looking for address 100, and you find it as an exact match, on what grounds do you require the consumer to continue looking? The consumer does not know the interval is empty unless they keep looking past the exact match. The line-number program opcodes do not tell you about ranges, they only tell you about address-to-source mappings for individual addresses.

Now, if the consumer is fully parsing the line table, and (one assumes) converting to some other internal representation, then they'll find an empty range, and have to figure out what to do with it. In that case, is throwing away the empty range really always the right thing to do? Ignoring them for address-to-source mapping seems quite reasonable, but ignoring them for source-to-address mapping ("break on line 12") seems like unnecessarily throwing information away.

I'm not sure I follow this - the line table maps instructions to source locations. If there are no instructions there's nothing to map

That's the intent. However, in practice it maps instruction *addresses* to source locations.

Well, in practice it's unspecified what it means - so we can look at this and come up with different interpretations. Mine is to interpret these as half open intervals - in which case [100, 100) is empty and you keep searching, then you find [100, 104) and that contains the address you're looking for.

Understood, but if you're walking through the line-number program looking for address 100, and you find it as an exact match, on what grounds do you require the consumer to continue looking?

I don't require it - this is in unspecified territory. My interpretation is that the line table describes regions of instructions with half-open ranges.

The consumer does not know the interval is empty unless they keep looking past the exact match. The line-number program opcodes do not tell you about ranges, they only tell you about address-to-source mappings for individual addresses.

My understanding is that the source location applies to all instructions between two entries. (hence why there's an "end of sequence" entry - which describes the one-past-the-end address, not because it's saying anything about the source at that address (the range is open at that end), but because it's terminating the previous range)

If you were searching for 101 you'd keep going from 100 to find the 104 and then conclude that the entry at 100 applies to 100, 101 (what you care about), 102, and 103 (not 99, nor 104).

I'm suggesting it could come out quite naturally from an implementation scanning forward and looking at ranges, not to special case the exact match of 100 - and to scan for ranges, and test inclusion within those ranges.

Now, if the consumer is fully parsing the line table, and (one assumes) converting to some other internal representation, then they'll find an empty range, and have to figure out what to do with it. In that case, is throwing away the empty range really always the right thing to do?

My interpretation (not backed up by the spec - this is unspecified) is that yes, throwing it away is accurate, because the line table represents an address to source mapping and this entry describes a mapping from zero instructions.

Ignoring them for address-to-source mapping seems quite reasonable, but ignoring them for source-to-address mapping ("break on line 12") seems like unnecessarily throwing information away.

My interpretation is that there is no information here. LLVM said "the following instructions are at line 12" then emitted no instructions and went on to say "the following instructions are at line 15" and then emitted instructions - I think it'd be inappropriate (I can't say incorrect, because the spec doesn't say) to conclude that the instructions on line 15 have any relationship to line 12.

mtrofin updated this revision to Diff 190145.Mar 11 2019, 1:16 PM
  • Fix search.
  • Empty ranges testcase
dblaikie added inline comments.Mar 11 2019, 3:35 PM
llvm/test/tools/llvm-symbolizer/only-empty-ranges.s
5–10

Could you include a comment describing (potentially quoting from llvm-dwarfdump output, stripping extraneous columns, line table header (but keep the line table column titles), etc) what the line table looks like & where the empty range ambiguities arise?

mtrofin updated this revision to Diff 190196.Mar 11 2019, 6:44 PM
  • improved test
dblaikie accepted this revision.Mar 12 2019, 8:15 AM

Great - thanks for your patience!

This revision is now accepted and ready to land.Mar 12 2019, 8:15 AM
This revision was automatically updated to reflect the committed changes.
This revision is now accepted and ready to land.Mar 12 2019, 5:50 PM

Before the patch:

=31747==ERROR: AddressSanitizer: stack-use-after-scope on address 0x007ffd652170 at pc 0x000135f69ae4 bp 0x007ffd6520a0 sp 0x007ffd652098
READ of size 4 at 0x007ffd652170 thread T0
    #0 0x135f69ae0 in main::$_0::operator()() const /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:11:14
    #1 0x135f69a90 in __invoke<(lambda at /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:10:9) &> /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/type_traits:4323:1
    #2 0x135f69a90 in int std::__ndk1::__invoke_void_return_wrapper<int>::__call<main::$_0&>(main::$_0&) /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/__functional_base:318
    #3 0x135f699b8 in std::__ndk1::__function::__func<main::$_0, std::__ndk1::allocator<main::$_0>, int ()>::operator()() /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/functional:1562:12
    #4 0x135f69b7c in std::__ndk1::function<int ()>::operator()() const /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/functional:1924:12
    #5 0x135f693b0 in main /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:16:10
    #6 0x7f836b88ec in __libc_init (/system/lib64/libc.so+0x1b8ec)
    #7 0x135f6926c in do_arm64_start (/data/local/tmp/Output/usr/local/google/home/vitalybuka/src/bbot/compiler_rt_build_android_aarch64/test/asan/AARCH64AndroidConfig/TestCases/Output/use-after-scope-capture.cc.tmp+0x126c)

After this patch:

=================================================================
==29475==ERROR: AddressSanitizer: stack-use-after-scope on address 0x007fdb2866d0 at pc 0x000102b50ae4 bp 0x007fdb286600 sp 0x007fdb2865f8
READ of size 4 at 0x007fdb2866d0 thread T0
    #0 0x102b50ae0 in main::$_0::operator()() const /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/memory:2062
    #1 0x102b50a90 in __invoke<(lambda at /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:10:9) &> /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/type_traits:4323:1
    #2 0x102b50a90 in int std::__ndk1::__invoke_void_return_wrapper<int>::__call<main::$_0&>(main::$_0&) /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/__functional_base:318
    #3 0x102b509b8 in std::__ndk1::__function::__func<main::$_0, std::__ndk1::allocator<main::$_0>, int ()>::operator()() /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/functional:1562:12
    #4 0x102b50b7c in std::__ndk1::function<int ()>::operator()() const /usr/local/google/home/vitalybuka/src/bbot/android_ndk/standalone-aarch64/lib/gcc/aarch64-linux-android/4.9.x/../../../../include/c++/4.9.x/functional:1924:12
    #5 0x102b503b0 in main /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:16:10
    #6 0x7fa5d658ec in __libc_init (/system/lib64/libc.so+0x1b8ec)
    #7 0x102b5026c in do_arm64_start (/data/local/tmp/Output/usr/local/google/home/vitalybuka/src/bbot/compiler_rt_build_android_aarch64/test/asan/AARCH64AndroidConfig/TestCases/Output/use-after-scope-capture.cc.tmp+0x126c)

Address 0x007fdb2866d0 is located in stack of thread T0 at offset 112 in frame
    #0 0x102b50294 in main /usr/local/google/home/vitalybuka/src/bbot/llvm/projects/compiler-rt/test/asan/TestCases/use-after-scope-capture.cc:6

FYI: @eugenis (sanitizer-build-cop)

ormris removed a subscriber: ormris.Mar 12 2019, 6:13 PM

Reverted in r356001.

mtrofin updated this revision to Diff 190554.Mar 13 2019, 6:41 PM
  • Revert "Revert "[llvm] Skip over empty line table entries.""
  • Handle cases when query address is between ranges.

Hmm, trying to stare at this function it's confusing me a fair bit. I'd expect this to be more obvious/legible than it is, but it's possible I'm just misunderstanding.

For instance, I don't know how this bit of code:

if (RowPos->Address.Address > Address.Address) {

Ever happens - if the initial condition in the function (Seq.containsPC(Address)) is satisfied.

Similarly I think:

while (RowPos + 1 < LastRow && RowPos->Address.Address == (RowPos + 1)->Address.Address) {

I /think/ this loop should never exit through the first condition, again, if Seq.containtsPC(Address) is true - since we searched for an address that is within the sequence, we can't have been sorting for the address in the last row (because that address isn't in the sequence - because it's half open (exclusive of the last element)).

I hope this can all be simplified a bit. I'm thinking something like...

if (!containsPC)
  return Unknown
RowPos = lower_bound
if (RowPos->Address != Address)
  --RowPos; //no further bounds checking needed - could have some asserts (because if the sequence contains the address, and the row we find is > the address, it must have an earlier row
while (next(RowPos)->Address == RowPos->Address) //again, no bounds checking required, RowPos can't be the last row here for similar reasons
  ++RowPos
return Seq.FirstRowIndex + (RowPos - FirstRow);

Does this seem OK? Does it not account for some cases?

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
890–891

I'd still be inclined to describe this as "a zero-length address range" or similar, rather than two entries for the same address - though I realize that gets into the weird semantic debate murky waters so maybe it's not worth worrying about.

892

'wrte'? Is that a typo of 'wrt'?

894

'empty ranges' perhaps? (since each row itself doesn't represent a thing that is empty/non-empty - it's pairs of rows that define ranges and those ranges can be empty)

896

s/having had/having/ I think reads more smoothly?

llvm/test/tools/llvm-symbolizer/only-empty-ranges.s
23

line /2/, col 12 I think? (

26–27

Not only the closest with a lower address, but the one that describes that range.

eg: 0x4 or 0x5 shouldn't be described by line 3, column 3 - they aren't described by this line table at all, even though it's the "closest line with a lower address".

30–31

"the last one" sounds like "the last one of the empty ranges" which isn't quite right - we want to ignore all the empty ranges and pick the range that covers 0x3 (necessarily a non-empty one).

mtrofin updated this revision to Diff 190754.Mar 14 2019, 4:49 PM
mtrofin marked 8 inline comments as done.

Simplify code

dblaikie accepted this revision.Mar 14 2019, 5:40 PM

Thanks a bunch - looks good to me.

I take it the added test coverage covers the case that was broken the firstn time this was committed? But were you also able to reproduce that original failure locally with compiler-rt or wherever it was? & have you verified that scenario (check-compiler-rt, or, again, whatever it was - I'm just guessing roughly from the emails I saw) is completely passing now with this change? (in case there were other "interesting" situations that scenario trips over)

Thanks a bunch - looks good to me.

I take it the added test coverage covers the case that was broken the firstn time this was committed? But were you also able to reproduce that original failure locally with compiler-rt or wherever it was? & have you verified that scenario (check-compiler-rt, or, again, whatever it was - I'm just guessing roughly from the emails I saw) is completely passing now with this change? (in case there were other "interesting" situations that scenario trips over)

Vitaly helped me with getting the aarch64 binary involved in the test built locally (thanks again!), and from there, I was able to identify this issue by inspecting the bot report and symbols in the local binary. I don't have an arm/ppc 'test rig' so I didn't do a real, full repro (meaning, running the binary with asan enabled, etc). The arm/ppc failures appeared to come from the same root cause (this one), based on the error. I feel comfortable submitting this at this point.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
890–891

I'd have to tie it back to the fact that there are successive entries with the same address, which explains the following loop. Leaving it as-is.

This revision was automatically updated to reflect the committed changes.