This is an archive of the discontinued LLVM Phabricator instance.

[lldb] fix #61942, discard "empty" ranges in DWARF to better handle gcc binary
Needs ReviewPublic

Authored by jwnhy on Apr 5 2023, 5:41 AM.

Details

Summary

This patch resolves an issue that lldb not be able to
match the correct range of certain address.

This issue caused by other compilers like gcc generates
"empty ranges" like [address, address) in the DIE. These
"empty ranges" cause lldb matches address with these
ranges unintentionally and causes unpredictable result.
(In #61942, a variable dispearred due to this issue)

This patch resolves this issue by discarding these "empty
ranges" during the parsing of debugging information.

Another approach in fixing this might be changing the
logic of "FindEntryThatContains" and let it try harder
if met "empty ranges".

Diff Detail

Event Timeline

jwnhy created this revision.Apr 5 2023, 5:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2023, 5:41 AM
jwnhy requested review of this revision.Apr 5 2023, 5:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2023, 5:41 AM

I'm not familiar with this code, but the issue as explained I think I understand.

It seems like you're checking for empty ranges in two places, what does each one do?

Is there anything else these ranges indicate or do we think it is simply a missed optimisation in the debug info producer? If they do sometimes mean something, then perhaps improving the searching of the ranges is preferable to discarding the empty ones.
(though if all tests pass with this patch then these empty ranges can't be that important to us can they)

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1318 ↗(On Diff #511063)

Combine this into one if. Also I'd put the Valid check first, just seems like the right order for it (not that it matters really).

The change looks fine and regardless of whether this makes sense or even complies with the standard, we should be resilient against it. I would like to see a test though.

jwnhy added a comment.Apr 5 2023, 6:35 PM

I'm not familiar with this code, but the issue as explained I think I understand.

It seems like you're checking for empty ranges in two places, what does each one do?

Is there anything else these ranges indicate or do we think it is simply a missed optimisation in the debug info producer? If they do sometimes mean something, then perhaps improving the searching of the ranges is preferable to discarding the empty ones.
(though if all tests pass with this patch then these empty ranges can't be that important to us can they)

Thanks for the comments, I didn't improve the search because that searching logic seems being used everywhere and not just restricted to filter through these "address ranges", I am a bit afraid of changing that also changes something subtle.
though it seems not making much sense to returning a "empty range" as searching results.

jwnhy added a comment.Apr 5 2023, 6:38 PM

The change looks fine and regardless of whether this makes sense or even complies with the standard, we should be resilient against it. I would like to see a test though.

Thanks a lot for the comment, I am new to lldb community, and got one thing a bit silly to ask.
Up to now, a few patches I submitted is kind of "depending on the compiler-generated" binary?
What am I supposed to do to ensure the compiler generates these "easy-to-fault" binaries in the test?

Like in this one, not every compiler will generate "empty ranges", and in the other one that is "DW_OP_div"...

I would like to add tests once I figure these out...

The change looks fine and regardless of whether this makes sense or even complies with the standard, we should be resilient against it. I would like to see a test though.

Thanks a lot for the comment, I am new to lldb community, and got one thing a bit silly to ask.
Up to now, a few patches I submitted is kind of "depending on the compiler-generated" binary?
What am I supposed to do to ensure the compiler generates these "easy-to-fault" binaries in the test?

Like in this one, not every compiler will generate "empty ranges", and in the other one that is "DW_OP_div"...

Yes, this would require a test that checks in what the compiler generates (as opposed to the majority of our "API tests" that build test cases from source). This can either be an assembly file (something like dwarf5-implicit-const.s) or a YAML file created with obj2yaml (something like section-overlap.yaml).

labath added a subscriber: labath.Apr 11 2023, 1:10 AM

The change looks fine and regardless of whether this makes sense or even complies with the standard, we should be resilient against it. I would like to see a test though.

Thanks a lot for the comment, I am new to lldb community, and got one thing a bit silly to ask.
Up to now, a few patches I submitted is kind of "depending on the compiler-generated" binary?
What am I supposed to do to ensure the compiler generates these "easy-to-fault" binaries in the test?

Like in this one, not every compiler will generate "empty ranges", and in the other one that is "DW_OP_div"...

Yes, this would require a test that checks in what the compiler generates (as opposed to the majority of our "API tests" that build test cases from source). This can either be an assembly file (something like dwarf5-implicit-const.s) or a YAML file created with obj2yaml (something like section-overlap.yaml).

You may be able to adapt/extend lldb/test/Shell/SymbolFile/DWARF/x86/debug_ranges.s

jwnhy updated this revision to Diff 512362.Apr 11 2023, 1:31 AM

Updated the patch to remove unncessary checks and added a testcase.

jwnhy added a comment.Apr 11 2023, 1:43 AM

The change looks fine and regardless of whether this makes sense or even complies with the standard, we should be resilient against it. I would like to see a test though.

It seems that DWARFv5 indeed specifies how to deal with these empty ranges

A bounded range entry whose beginning and ending address offsets are equal
(including zero) indicates an empty range and may be ignored.

Also, I kind of searched through the codebase, and noticed that there are multiple places like this, and they adapt a similar approach discarding those empty ranges.
e.g. in DWARFDebugRanges.cpp

  // Filter out empty ranges
  if (begin < end)
    range_list.Append(DWARFRangeList::Entry(begin + base_addr, end - begin));
}