This is an archive of the discontinued LLVM Phabricator instance.

[LLDB][NativePDB] Add support for S_DEFRANGE_REGISTER and S_DEFRANGE_SUBFIELD_REGISTER
ClosedPublic

Authored by zequanwu on Feb 10 2022, 6:48 PM.

Details

Summary

This adds support for S_DEFRANGE_REGISTER and S_DEFRANGE_SUBFIELD_REGISTER that describe S_LOCAL.
S_DEFRANGE_REGISTER describes a list of ranges where the value of S_LOCAL (a local variable) is in a register.
S_DEFRANGE_SUBFIELD_REGISTER describes a list of ranges where the value of a subfield is in a register.

204 | S_LOCAL [size = 12] `s`
      type=0x2478 (S), flags = none
216 | S_DEFRANGE_SUBFIELD_REGISTER [size = 20]
      register = ECX, may have no name = true, offset in parent = 0
      range = [0001:0003,+14), gaps = []
236 | S_DEFRANGE_SUBFIELD_REGISTER [size = 20]
      register = DL, may have no name = true, offset in parent = 4
      range = [0001:0006,+11), gaps = []

Here s has two fields x(int) and y(char). The two S_DEFRANGE_SUBFIELD_REGISTER say that the value of s.x is in ecx in range [0001:0003,+14) and the value of s.y is in dl in range [0001:0006,+11).

For S_DEFRANGE_SUBFIELD_REGISTER, if we want to describe the location precisely, we need to use dwarf location lists to describe the range and construct a dwarf debug location table, which I'm not sure if it's difficult.
So, I choose to let the range list be the overlap ranges among all range lists. For example, we have 3 S_DEFRANGE_SUBFIELD_REGISTER describing 3 different fields in a structure and they have following range lists. The final range list will be [1,2), [4, 6).

  1. [0, 3), [4, 8)
  2. [0, 2), [3, 6)
  3. [1, 2), [4, 10)

Diff Detail

Event Timeline

zequanwu requested review of this revision.Feb 10 2022, 6:48 PM
zequanwu created this revision.

I'd say that the mere fact that we need to construct dwarf expressions shows that we have the abstractions set up incorrectly. While we're constructing a single expression, it is sort of ok-ish (dwarf expressions are kind of used in contexts outside the dwarf standard anyway), but I definitely wouldn't want to start constructing dwarf location lists in order to represent a variable location. We should have some kind of a generic "variable location provider" interface, and individual symbol files can implement that interface using any means at their disposal.

I know this patch doesn't do that, but I had to say that anyway. Now for the patch...

I haven't looked at it in detail (I don't know much about PDBs anyway), but it seems relatively straight-forward. What worries me is the test. Specifically its dependence on clang's optimizer producing a certain kind of debug info. In dwarf, we've recently started using tests which spell out the debug info (more or less) directly -- either in assembly, or through yaml. I'm hoping that something similar can be done with pdbs as well. The advantage is that you get more targeted tests that can run anyway (you don't need to actually run the binary -- a lot of these things can be tested through image lookup -- you can look at some of the debug_loc dwarf tests for inspiration). That's important as it gives non-windows developers (i.e. pretty much everyone) some assurance that they haven't broken pdb support. And this is also pretty much the only way to test the handling of weird/malformed debug info.

Having an end-to-end test which runs the binary and checks that we're actually able to print the variable values is great, but ideally, it shouldn't be our first or only line of defence.

rnk added a comment.Feb 14 2022, 10:52 AM

Regarding the use of DWARF expressions, I agree with @labath, it would be better to have some abstraction here, but for now it seems OK to do this. By the way, LLVM's codeview emission logic pattern matches partial DWARF expressions (from DIExpression) to produce these S_DEFRANGE_* records. Essentially, we are almost round-tripping the variable location info from DWARF through codeview and back (not entirely accurate, but close enough to be amusing).

Regarding the testing strategy, I think assembly tests are the way to go here. We have a high-level .cv_def_range assembler directive to produce these records from labels in the code. That should make it reasonably easy to construct readable tests by modifying LLVM's assembly output. We have spent some effort to make it readable, although it's always possible to do more.

zequanwu updated this revision to Diff 408677.Feb 14 2022, 6:14 PM

Changed test case to assembly, but we still need to test it on Windows. LLDB commands like image lookup -a ... -v will give incorrect register numbers in variables' locations due to https://github.com/llvm/llvm-project/issues/53575.

LLDB commands like image lookup -a ... -v will give incorrect register numbers in variables' locations due to https://github.com/llvm/llvm-project/issues/53575.

While definitely not ideal, I don't think that's a showstopper. You can put a note next to the expectations that the register names are incorrect and a link to the bug. The general expression structure should still be correct, and that's the most interesting aspect of this patch.

And yeah, we should definitely do something about that bug.

lldb/source/Plugins/SymbolFile/NativePDB/DWARFLocationExpression.h
44

ArrayRef

lldb/source/Plugins/SymbolFile/NativePDB/PdbUtil.cpp
733–764

Is this computing the intersection of the two range lists? If so, we can put it into a separate utility function (with its own tests) and then use it here.

zequanwu marked an inline comment as done.EditedFeb 16 2022, 11:25 AM

LLDB commands like image lookup -a ... -v will give incorrect register numbers in variables' locations due to https://github.com/llvm/llvm-project/issues/53575.

While definitely not ideal, I don't think that's a showstopper. You can put a note next to the expectations that the register names are incorrect and a link to the bug. The general expression structure should still be correct, and that's the most interesting aspect of this patch.

When I try to use image lookup -a ... -v to print variables, all variables are printed even if they are unavailable at that address, sent D119963 to fix it.

zequanwu updated this revision to Diff 409452.Feb 16 2022, 4:44 PM

Address comments and update test to use image lookup -a ... -v -R for testing.

zequanwu marked an inline comment as done.Feb 16 2022, 4:44 PM
zequanwu updated this revision to Diff 412555.Mar 2 2022, 3:07 PM

Rebase to dump single range of variables.

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 3:07 PM
labath added inline comments.Mar 3 2022, 2:10 AM
lldb/source/Plugins/SymbolFile/NativePDB/PdbUtil.cpp
609

I don't think this is correct. Imagine that one list contains the range [0, 10), and the other [4, 7) -- the overlap (intersection) should be [4,7), but I think this algorithm will produce an empty list.

For a correct implementation, I think we'll need to do a parallel iteration over the two lists (incrementing the one whose entry begins first, or something).

That's why I wanted to put this function into RangeMap.h -- then we could write dedicated unit tests for it.

zequanwu updated this revision to Diff 412880.Mar 3 2022, 5:41 PM

Add unit test for RangeVector::GetOverlaps.

zequanwu marked an inline comment as done.Mar 3 2022, 5:42 PM

Just a couple of comments. Otherwise, I think this is fine.

lldb/source/Plugins/SymbolFile/NativePDB/DWARFLocationExpression.h
15

Not needed anymore, I guess?

lldb/unittests/Utility/RangeMapTest.cpp
45–75

I think this'd be easier to read if you grouped it by "test case":
e.g.

// same range
V1.Append(0, 1);
V2.Append(0, 1);
Expected.Append(0, 1);

// another case
...

(I'm not sure why this was not sent with the rest)

lldb/source/Core/Address.cpp
732–734 ↗(On Diff #412880)

This function is the only caller of LocationIsValidForAddress. I think it'd be better to move this check there.

zequanwu updated this revision to Diff 414210.Mar 9 2022, 1:57 PM
zequanwu marked 3 inline comments as done.
zequanwu edited the summary of this revision. (Show Details)

Update.

Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 1:57 PM
labath accepted this revision.Mar 10 2022, 7:05 AM
labath added inline comments.
lldb/source/Symbol/Variable.cpp
251

add if (!valid_in_scope_range) return false; and undo the other changes

This revision is now accepted and ready to land.Mar 10 2022, 7:05 AM
zequanwu updated this revision to Diff 414464.Mar 10 2022, 12:39 PM

address comment

This revision was landed with ongoing or failed builds.Mar 10 2022, 12:40 PM
This revision was automatically updated to reflect the committed changes.