This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Fix a bug in line info handling
ClosedPublic

Authored by loladiro on May 21 2015, 4:24 PM.

Details

Summary

This fixes a bug in the line info handling in the dwarf code, based on a
problem I when implementing RelocVisitor support for MachO.
Since addr+size will give the first address past the end of the function,
we need to back up one line table entry. Fix this by looking up the end_addr-1,
which is the last address in the range. Also factor out the common functionality
into a separate function.
This comes up on MachO much more than on ELF, since MachO
doesn't store the symbol size separately, hence making
said situation always occur.

Diff Detail

Repository
rL LLVM

Event Timeline

loladiro updated this revision to Diff 26285.May 21 2015, 4:24 PM
loladiro retitled this revision from to [DWARF] Fix a bug in line info handling.
loladiro updated this object.
loladiro edited the test plan for this revision. (Show Details)
loladiro added reviewers: echristo, friss, dblaikie.
loladiro set the repository for this revision to rL LLVM.
loladiro added a subscriber: Unknown Object (MLST).
dblaikie edited edge metadata.May 21 2015, 4:32 PM

+Alexey

samsonov added inline comments.May 21 2015, 4:59 PM
lib/DebugInfo/DWARF/DWARFDebugLine.cpp
608

I don't understand this part. Looks like row_pos should be the first row that contains "address". It means that the lookup here should be similar to the one done in lookupAddress function (of course, I'd prefer this code to be shared).

628

Can you simply find the number of row that describes the address "end_addr - 1"?

Yes, I tried, just using address - 1 for the second patch and it passes the testcase, so I'm happy. I don't know enough about this code to decide which is better, so just let me know and I'll update the code accordingly.

samsonov edited edge metadata.May 21 2015, 11:13 PM

Personally, I would prefer to have a single function that would take an address, and a Sequence, describing this address, and return a row in that sequence, describing this address. Then we can use this function in several places (lookupAddress, lookupAddressRange).

loladiro updated this revision to Diff 26339.May 22 2015, 12:26 PM
loladiro updated this object.
loladiro edited edge metadata.

Updated to factor out common code into a separate function.

samsonov added inline comments.May 22 2015, 1:03 PM
lib/DebugInfo/DWARF/DWARFDebugLine.cpp
525

consider moving this constant to LineTable struct in the header (and rename to smth. like UnknownRowIndex)

571

Looks like this condition should be checked in findRowInSequence function above.

603

I'd rather not provide default value for variables, if we're unconditionally initializing them later.

608

Just a ternary operator?

uint32_t first_row_index = (seq_pos == start_pos) ? findRowInSeq(cur_seq, address) : cur_seq.FirstRowIndex;
614

Fun fact: if you add the call to Sequence::containsPC() at the beginning of findRowInSequence, as suggested above, you won't need this special case:

uint32_t last_row_index = findRowInSequnce(cur_seq, end_addr - 1);
if (last_row_index == UnknownRowIndex)
  last_row_index = cur_seq.LastRowIndex - 1;
621

assert that both first_row_index and last_row_index are valid? Or return false in this case.

loladiro added inline comments.May 22 2015, 1:10 PM
lib/DebugInfo/DWARF/DWARFDebugLine.cpp
525

Will do.

571

Sure, it wasn't there before below, so I didn't pull it, but seems reasonable.

603

We're not, this also handles the case when seq_pos == start_pos is false.

608

But that wouldn't capture the case where findRowInSeq returns an UnknownRowIndex.

614

Sounds good.

loladiro updated this revision to Diff 26348.May 22 2015, 2:13 PM

Updated again. I also noticed that we were duplicating the last row before (the end_sequence one, which is not in general valid - I believe), so I fixed the test as well - please make sure that seems correct.

samsonov accepted this revision.May 22 2015, 3:48 PM
samsonov edited edge metadata.

I don't know about the data needed for llvm-rtdyld, so can't comment about the changes to test/DebugInfo/debuglineinfo.test. Otherwise, the patch now looks fine to me (modulo comments below).

include/llvm/DebugInfo/DWARF/DWARFDebugLine.h
208

Fix a comment about lookupAddress return value to mention this. Also, lookupAddress should probably be private, the public interface is getFileLineInfoForAddress. Looks like LineTable is now a class, not a struct.

lib/DebugInfo/DWARF/DWARFDebugLine.cpp
538

Run patch through clang-format

601

Hm, should this be a const reference instead?

This revision is now accepted and ready to land.May 22 2015, 3:48 PM

llvm-rtdyld just uses getLineInfoForAddressRange and dumps that out, so the change in the test case directly corresponds to a change in the return value of that function. However, I do believe that this is correct now, because the extra line was marked with end_sequence, which the DWARF standard says is A boolean indicating that the current address is that of the first byte after the end of a sequence, so I believe it shouldn't be included if you're asking for just the rows that cover the function. Will do the stylistic changes you note.

loladiro updated this revision to Diff 26367.May 22 2015, 9:52 PM
loladiro edited edge metadata.

Run everything through clang-format, also change cur_seq to a const reference as suggested. I didn't adjust the visibility of LineTable, as that seems to be a separate commit, preferably done by somebody who knows which part of the interface should be public.

Note that this is currently uncommitable, because llvm-rtdyld can no longer figure out the size of MachO symbols due to rL238028. Maybe @rafael can help fix llvm-rtdyld.

This revision was automatically updated to reflect the committed changes.