This is an archive of the discontinued LLVM Phabricator instance.

[NativePDB] Add the ability to display global variables
ClosedPublic

Authored by zturner on Oct 25 2018, 1:58 PM.

Details

Summary

LLDB has the ability to display global variables, even without a running process, via the target variable command. This is because global variables are linker initialized, so their values are embedded directly into the executables. This gives us great power for testing native PDB functionality in a cross-platform manner, because we don't actually need a running process. We can just create a target using an EXE file, and display global variables. And global variables can have arbitrarily complex types, so in theory we can fully exercise the type system, record layout, and data formatters for native PDB files and PE/COFF executables on any host platform, as long as our type does not require a dynamic initializer.

This patch adds basic support for finding variables by name, and adds an exhaustive test for fundamental data types and pointers / references to fundamental data types.

Subsequent patches will extend this to typedefs, classes, pointers to functions, and other cases.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Oct 25 2018, 1:58 PM

Is there anything PDB specific about the test you've added? If not, it might be good to use this as a generic SymbolFile test.

which generic SymbolFile test do you mean? The lit ones are the only ones that are set up to run in this particular manner (run lines, check lines, etc), and currently we don't have a way to run different / multiple command line invocations. I came up with this test in order to test the new changes I introduced in SymbolFileNativePdb.cpp that are also in this patch, so I definitely want to make sure all of that code still gets exercised with whatever test strategy I end up using.

I guess the way I see it, there's two interesting things this test could exercise. 1) the debug info, and 2) the formatters. If we want to test the formatters, we could probably brainstorm a way to do it generically with 1 test (or set of tests) that omits the SymbolFile from the picture entirely and is generic enough that it should be applicable to any platform/compiler/host. If we want to test the SymbolFile plugin though, then it may need to be specific to the debug info type.

I think what you're getting at though is that we probably need some notion of "debug info variants" for these lit tests like we have in the dotest suite. I actually had an idea for something that might work like that a while back, which I described here: https://reviews.llvm.org/D52618#1252906

But the idea would basically be that instead of hardcoding a command line like I've done with clang-cl /Z7 etc etc we could write something more generic that would evaluate to different (or perhaps even multiple) things, so we could run it different ways.

Well, what's really going on is that I'm not familiar enough with lit to know that it doesn't have the ability to run different commands to produce the input file... But as you guessed, my point is that you have written a bunch of tests that would be valuable to test against any symfile format, so it is a shame to only run them against one format. OTOH, if that's not possible right now, I'm content to wait for some enhancements to lit that make it possible.

Well, what's really going on is that I'm not familiar enough with lit to know that it doesn't have the ability to run different commands to produce the input file... But as you guessed, my point is that you have written a bunch of tests that would be valuable to test against any symfile format, so it is a shame to only run them against one format. OTOH, if that's not possible right now, I'm content to wait for some enhancements to lit that make it possible.

I can maybe hardcode multiple runlines, one for each possible target triple. But what's really going on is that I'm not familiar enough with other platforms to know how to generate their binaries :)

But I'll actually give it a try. Just to explain what's going on here, I've got these two lines:

// RUN: clang-cl /Z7 /GS- /GR- /c /Fo%t.obj -- %s 
// RUN: lld-link /DEBUG /nodefaultlib /entry:main /OUT:%t.exe /PDB:%t.pdb -- %t.obj

Those two run the compiler and linker and write the output to a temporary file, but don't actually check anything yet. The next two lines (which are actually one command broken with a continuation character)

// RUN: env LLDB_USE_NATIVE_PDB_READER=1 lldb -f %t.exe -s \
// RUN:     %p/Inputs/globals-fundamental.lldbinit | FileCheck %s

runs lldb and pipes the result to FileCheck, using this file as the check file.

So what I could *try* doing is having multiple compiler / linker invocations. One that produces a Windows executable / PDB file. One that produces a Linux executable / DWARF. One that produces some kind of OSX binary (I think lld doesn't work with MachO yet, so I'm not sure about this one), writing the linker outputs to different files each time.

Then we could maybe run lldb 3 times, once against each input, but using the same check file each time. I'll give it a try and see what happens.

Could you also use Vedant's new FileCheck dotest test class? That should allow you to write the tests exactly as you are, but use the dotest mechanism to build and run the example.

lemo accepted this revision.Oct 25 2018, 3:45 PM

looks good. a few comments inline.

lldb/source/Plugins/SymbolFile/NativePDB/PdbSymUid.h
46 ↗(On Diff #171189)

30 + 1 != 32 - what's going on?

135 ↗(On Diff #171189)

= {}; ? (to avoid uninitialized bits)

lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
892 ↗(On Diff #171189)

assert that section > 0 ? (as precondition)

913 ↗(On Diff #171189)

comment explaining the - 1 ?

936 ↗(On Diff #171189)

pls explicitly initialize all the locals

1406 ↗(On Diff #171189)

type_sp->get() seems cleaner / more readable

This revision is now accepted and ready to land.Oct 25 2018, 3:45 PM

Could you also use Vedant's new FileCheck dotest test class? That should allow you to write the tests exactly as you are, but use the dotest mechanism to build and run the example.

Adding Vedant here. My understanding is that the work he did there is like the inverse of what I'm doing. It allows you to use FileCheck from inside of dotest tests, but I was trying to bypass dotest here. I don't necessarily think "dotest for all things" should be an explicit goal (i actually think long term we should move away from it, but that's for another day). Aside from that though, I don't think this particular test needs any of the functionality that dotest offers. It offers building in multiple configurations, but we can get that from this test by just specifying mulitple run lines (I'm testing this out locally and it seems like I can get it to work). Eventually, using some kind of configuration / builder type system like I brainstormed in the review thread I linked previously, I think we could have all the advantages of dotest's builder even with this style of test.

FWIW, I was also under the impression that Vedant's solution with FileCheck in dotest was only intended as sort of an immediate solution to a problem, but not necessary the long term desired end-state. (I could be wrong about this though).

vsk added a comment.Oct 25 2018, 4:19 PM

Could you also use Vedant's new FileCheck dotest test class? That should allow you to write the tests exactly as you are, but use the dotest mechanism to build and run the example.

Adding Vedant here. My understanding is that the work he did there is like the inverse of what I'm doing. It allows you to use FileCheck from inside of dotest tests, but I was trying to bypass dotest here. I don't necessarily think "dotest for all things" should be an explicit goal (i actually think long term we should move away from it, but that's for another day). Aside from that though, I don't think this particular test needs any of the functionality that dotest offers. It offers building in multiple configurations, but we can get that from this test by just specifying mulitple run lines (I'm testing this out locally and it seems like I can get it to work). Eventually, using some kind of configuration / builder type system like I brainstormed in the review thread I linked previously, I think we could have all the advantages of dotest's builder even with this style of test.

FWIW, I was also under the impression that Vedant's solution with FileCheck in dotest was only intended as sort of an immediate solution to a problem, but not necessary the long term desired end-state. (I could be wrong about this though).

The tests as-written seem to exercise the new functionality. I think it'd be fine to use the FileCheck-in-{dotest,lldbinline} support if you need to set a breakpoint, run a command, and then validate its output (though it looks like you don't)

In D53731#1276743, @vsk wrote:

Could you also use Vedant's new FileCheck dotest test class? That should allow you to write the tests exactly as you are, but use the dotest mechanism to build and run the example.

Adding Vedant here. My understanding is that the work he did there is like the inverse of what I'm doing. It allows you to use FileCheck from inside of dotest tests, but I was trying to bypass dotest here. I don't necessarily think "dotest for all things" should be an explicit goal (i actually think long term we should move away from it, but that's for another day). Aside from that though, I don't think this particular test needs any of the functionality that dotest offers. It offers building in multiple configurations, but we can get that from this test by just specifying mulitple run lines (I'm testing this out locally and it seems like I can get it to work). Eventually, using some kind of configuration / builder type system like I brainstormed in the review thread I linked previously, I think we could have all the advantages of dotest's builder even with this style of test.

FWIW, I was also under the impression that Vedant's solution with FileCheck in dotest was only intended as sort of an immediate solution to a problem, but not necessary the long term desired end-state. (I could be wrong about this though).

The tests as-written seem to exercise the new functionality. I think it'd be fine to use the FileCheck-in-{dotest,lldbinline} support if you need to set a breakpoint, run a command, and then validate its output (though it looks like you don't)

Yea, once I need to actually run a process I expect I'll need to start using it. For the time being, I'm trying see how much ground I can cover without a process. The reasoning is two-fold.

First, it means the tests can run anywhere. If there's no process, my test coverage isn't limited to a specific host OS.

Second, from a layering perspective, if I know that there's a ton of test coverage for "everything that happens before a process is spawned", then once you get to the point that you are spawning process, it limits the scope of where you have to search when something does go wrong.

So that's my thinking.

dotest tests don't require a process. Presumably dotest knows how to build windows targeted PDB debug flavor files (to go along with dwarf/dsym/etc.). So it would be straightforward to make a test that had your input sources, built and made a target out of it and then poked at static variables and their types. That would straightaway run with all the different symbol file formats we support.

That was why using Vedant's FileCheck thing made sense to me. You would use that to specify the test cases, since you like that way of writing tests and anyway already have them written out in that form, but use the dotest machinery to build it for whatever symfile format and target architecture/s the person who was running the test dialed up. But if you are interesting in also getting this to work with the straight FileCheck style test, your time is your own...

BTW, I would use dotest tests specifically for the kind of thing you are doing here because then you can test against the SBType and SBTypeMembers from the debug info you've ingested, which would give you bit and byte offsets and sizes for free. But if your differing tastes end up getting you to add that info to "type lookup" - which we really should do - then I guess we win either way...

dotest tests don't require a process. Presumably dotest knows how to build windows targeted PDB debug flavor files (to go along with dwarf/dsym/etc.). So it would be straightforward to make a test that had your input sources, built and made a target out of it and then poked at static variables and their types. That would straightaway run with all the different symbol file formats we support.

That was why using Vedant's FileCheck thing made sense to me. You would use that to specify the test cases, since you like that way of writing tests and anyway already have them written out in that form, but use the dotest machinery to build it for whatever symfile format and target architecture/s the person who was running the test dialed up. But if you are interesting in also getting this to work with the straight FileCheck style test, your time is your own...

BTW, I would use dotest tests specifically for the kind of thing you are doing here because then you can test against the SBType and SBTypeMembers from the debug info you've ingested, which would give you bit and byte offsets and sizes for free. But if your differing tastes end up getting you to add that info to "type lookup" - which we really should do - then I guess we win either way...

Yea, so one of the commands I'm used to on the Windows command line debugger WinDbg is the dt (dump type) command. It's basically the equivalent of type lookup - more powerful in some ways, less in others . The output format looks like this:

0:000> dt _EXCEPTION_RECORD
ntdll!_EXCEPTION_RECORD
   +0x000 ExceptionCode    : Int4B
   +0x004 ExceptionFlags   : Uint4B
   +0x008 ExceptionRecord  : Ptr64 _EXCEPTION_RECORD
   +0x010 ExceptionAddress : Ptr64 Void
   +0x018 NumberParameters : Uint4B
   +0x020 ExceptionInformation : [15] Uint8B

So you can see the field offsets and such. (Less powerful because that's about it, no methods, for example).

But you can also pass it an address, which it then formats the block of memory as that type. e.g.

0:000> dt ntdll!_TEB32 00000000007ed000
   +0x000 NtTib            : _NT_TIB32
   +0x01c EnvironmentPointer : 0
   +0x020 ClientId         : _CLIENT_ID32
   +0x028 ActiveRpcHandle  : 0
   +0x02c ThreadLocalStoragePointer : 0
   +0x030 ProcessEnvironmentBlock : 0x7ed000
   +0x034 LastErrorValue   : 0
   +0x038 CountOfOwnedCriticalSections : 0
   +0x03c CsrClientThread  : 0
   +0x040 Win32ThreadInfo  : 0x4d18
   +0x044 User32Reserved   : [26] 0
   +0x0ac UserReserved     : [5] 0

So up on my list of things to implement is these two features.

The first of the commands you showed presents info we definitely should add to type lookup. I actually have a bug around to do that, but it hasn't risen to the top of my queue because it's trivial to do with the SB API's so every time I've needed that info I get it from script. Selfish me...

The second command is done in lldb with "memory read -t TYPE", and you can also use the "-c COUNT" argument to treat the memory as an array of COUNT elements of the TYPE:

(lldb) fr v myFoo
(foo *) myFoo = 0x0000000100400000
(lldb) memory read -t foo 0x0000000100400000
(foo) 0x100400000 = {
  First = 0
  Second = 0
}
(lldb) memory read -t foo 0x0000000100400000 -c 10
(foo) 0x100400000 = {
  First = 0
  Second = 0
}
(foo) 0x100400008 = {
  First = 1
  Second = 10
}
(foo) 0x100400010 = {
  First = 2
  Second = 20
}

if you want to also see the types of all the subelements add the -T flag, and if you want to see all the memory locations of the subelements, add the -L flag:

(lldb) memory read -t foo 0x0000000100400000 -c 10 -T -L
0x0000000100400000: (foo) 0x100400000 = {
0x0000000100400000:   (int) First = 0
0x0000000100400004:   (int) Second = 0
}
0x0000000100400008: (foo) 0x100400008 = {
0x0000000100400008:   (int) First = 1
0x000000010040000c:   (int) Second = 10
}

BTW, the latter two flags have the same meaning pretty much wherever we print values (expression, frame var...)

zturner added inline comments.Oct 26 2018, 1:17 AM
lldb/source/Plugins/SymbolFile/NativePDB/PdbSymUid.h
46 ↗(On Diff #171189)

There's supposed to be a uint64_t unused : 1; here. Thanks for noticing.

lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
913 ↗(On Diff #171189)

I'm going to be totally honest here. I don't understand this code at all. I copied it from the SymbolFilePDB implementation. It seems to work :-/

936 ↗(On Diff #171189)

TypeIndex has a constructor, but I should definitely initialize the lldb::addr_t

1406 ↗(On Diff #171189)

You know, I kept trying to write that, and I was like "what silly person on the C++ standards committee forgot to put a get() method on std::shared_ptr<>. Did they really forget this or am I just going crazy?" Turns out it's a bug in MSVC intellisense where it doesn't show it for some reason. So I took that to mean it didn't exist. Shame on me.

zturner added inline comments.Oct 26 2018, 1:22 AM
lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
913 ↗(On Diff #171189)

Ok, I thought about this some, it's actually pretty simple. I'll add a comment.

This revision was automatically updated to reflect the committed changes.