This fixes a bug in the line info handling in the dwarf code, based on a
problem I when implementing RelocVisitor support for MachO.
Since addr+size will give the first address past the end of the function,
we need to back up one line table entry. Fix this by looking up the end_addr-1,
which is the last address in the range. Also factor out the common functionality
into a separate function.
This comes up on MachO much more than on ELF, since MachO
doesn't store the symbol size separately, hence making
said situation always occur.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/DebugInfo/DWARF/DWARFDebugLine.cpp | ||
---|---|---|
608 ↗ | (On Diff #26285) | I don't understand this part. Looks like row_pos should be the first row that contains "address". It means that the lookup here should be similar to the one done in lookupAddress function (of course, I'd prefer this code to be shared). |
628 ↗ | (On Diff #26285) | Can you simply find the number of row that describes the address "end_addr - 1"? |
Yes, I tried, just using address - 1 for the second patch and it passes the testcase, so I'm happy. I don't know enough about this code to decide which is better, so just let me know and I'll update the code accordingly.
Personally, I would prefer to have a single function that would take an address, and a Sequence, describing this address, and return a row in that sequence, describing this address. Then we can use this function in several places (lookupAddress, lookupAddressRange).
lib/DebugInfo/DWARF/DWARFDebugLine.cpp | ||
---|---|---|
525 ↗ | (On Diff #26339) | consider moving this constant to LineTable struct in the header (and rename to smth. like UnknownRowIndex) |
571 ↗ | (On Diff #26339) | Looks like this condition should be checked in findRowInSequence function above. |
603 ↗ | (On Diff #26339) | I'd rather not provide default value for variables, if we're unconditionally initializing them later. |
608 ↗ | (On Diff #26339) | Just a ternary operator? uint32_t first_row_index = (seq_pos == start_pos) ? findRowInSeq(cur_seq, address) : cur_seq.FirstRowIndex; |
614 ↗ | (On Diff #26339) | Fun fact: if you add the call to Sequence::containsPC() at the beginning of findRowInSequence, as suggested above, you won't need this special case: uint32_t last_row_index = findRowInSequnce(cur_seq, end_addr - 1); if (last_row_index == UnknownRowIndex) last_row_index = cur_seq.LastRowIndex - 1; |
621 ↗ | (On Diff #26339) | assert that both first_row_index and last_row_index are valid? Or return false in this case. |
lib/DebugInfo/DWARF/DWARFDebugLine.cpp | ||
---|---|---|
525 ↗ | (On Diff #26339) | Will do. |
571 ↗ | (On Diff #26339) | Sure, it wasn't there before below, so I didn't pull it, but seems reasonable. |
603 ↗ | (On Diff #26339) | We're not, this also handles the case when seq_pos == start_pos is false. |
608 ↗ | (On Diff #26339) | But that wouldn't capture the case where findRowInSeq returns an UnknownRowIndex. |
614 ↗ | (On Diff #26339) | Sounds good. |
Updated again. I also noticed that we were duplicating the last row before (the end_sequence one, which is not in general valid - I believe), so I fixed the test as well - please make sure that seems correct.
I don't know about the data needed for llvm-rtdyld, so can't comment about the changes to test/DebugInfo/debuglineinfo.test. Otherwise, the patch now looks fine to me (modulo comments below).
include/llvm/DebugInfo/DWARF/DWARFDebugLine.h | ||
---|---|---|
208 ↗ | (On Diff #26348) | Fix a comment about lookupAddress return value to mention this. Also, lookupAddress should probably be private, the public interface is getFileLineInfoForAddress. Looks like LineTable is now a class, not a struct. |
lib/DebugInfo/DWARF/DWARFDebugLine.cpp | ||
538 ↗ | (On Diff #26348) | Run patch through clang-format |
601 ↗ | (On Diff #26348) | Hm, should this be a const reference instead? |
llvm-rtdyld just uses getLineInfoForAddressRange and dumps that out, so the change in the test case directly corresponds to a change in the return value of that function. However, I do believe that this is correct now, because the extra line was marked with end_sequence, which the DWARF standard says is A boolean indicating that the current address is that of the first byte after the end of a sequence, so I believe it shouldn't be included if you're asking for just the rows that cover the function. Will do the stylistic changes you note.
Run everything through clang-format, also change cur_seq to a const reference as suggested. I didn't adjust the visibility of LineTable, as that seems to be a separate commit, preferably done by somebody who knows which part of the interface should be public.