This is an archive of the discontinued LLVM Phabricator instance.

Fix TestBreakpointInGlobalConstructor for Windows
ClosedPublic

Authored by amccarth on Feb 16 2018, 3:53 PM.

Details

Summary

This test was failing on Windows because it expected the breakpoint in the
dynamic library to be resolved before the process is launched. Since the DLL
isn't loaded until the process is launched this didn't work.

Changing the expectation solves the problem on Windows, but since I'm not
sure how this works on other platforms, I'm afraid this could break it there.
I'm considering whether we should treat 0 as a special value for
num_expected_locations that means 0 or more.

Event Timeline

amccarth created this revision.Feb 16 2018, 3:53 PM
zturner added a subscriber: zturner.

What's the behavior on other platforms? When you set a breakpoint in a shared library before you've run the program, shouldn't it still be unresolved, in which case this test should have failed on those platforms too?

On Darwin we load all the libraries that the binary links against pre-execution, if possible. So I see:

% lldb a.out
(lldb) ima li libfoo.dylib
[ 0] 35C5FE62-5327-3335-BBCF-5369CB07D1D6 0x0000000000000000 /Volumes/ssdhome/jmolenda/k/svn/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/global_constructor/libfoo.dylib

/Volumes/ssdhome/jmolenda/k/svn/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/global_constructor/libfoo.dylib.dSYM/Contents/Resources/DWARF/libfoo.dylib

(lldb) br s -f foo.cpp -p BR_foo
Breakpoint 1: where = libfoo.dylib`Foo::Foo() + 18 at foo.cpp:4, address = 0x0000000000000f52
(lldb) br li
Current breakpoints:
1: source regex = "BR_foo", exact_match = 0, locations = 1

1.1: where = libfoo.dylib`Foo::Foo() + 18 at foo.cpp:4, address = libfoo.dylib[0x0000000000000f52], unresolved, hit count = 0

On Darwin we load all the libraries that the binary links against pre-execution, if possible. So I see:

% lldb a.out
(lldb) ima li libfoo.dylib
[ 0] 35C5FE62-5327-3335-BBCF-5369CB07D1D6 0x0000000000000000 /Volumes/ssdhome/jmolenda/k/svn/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/global_constructor/libfoo.dylib

/Volumes/ssdhome/jmolenda/k/svn/lldb/packages/Python/lldbsuite/test/functionalities/breakpoint/global_constructor/libfoo.dylib.dSYM/Contents/Resources/DWARF/libfoo.dylib

(lldb) br s -f foo.cpp -p BR_foo
Breakpoint 1: where = libfoo.dylib`Foo::Foo() + 18 at foo.cpp:4, address = 0x0000000000000f52
(lldb) br li
Current breakpoints:
1: source regex = "BR_foo", exact_match = 0, locations = 1

1.1: where = libfoo.dylib`Foo::Foo() + 18 at foo.cpp:4, address = libfoo.dylib[0x0000000000000f52], unresolved, hit count = 0

So IIUC it's still unresolved because you don't know the load address of the dylib until runtime, but it at least has a location because you know the RVA in the dylib?

Yeah, it has a location because we're resolved it to a File address (section+offset), and when the dylib actually gets loaded we'll resolve that to a load address and insert the breakpoint instruction.

I'm not sure how hard it would be, but it would be better if we could do the same thing on Windows, for consistency of behavior. In principle it's straightforward, just scan through the IAT and load all of the imported modules. Probably the DynamicLoaderWindows plugin is doing this right now.

OK, so my fix is too simplistic. I'll have to (1) make LLDB on Windows also attempt to pre-load the DLLs, (2) make the test use platform-specific expectations, or (3) use a looser expectation. I'll give it some thought over the weekend.

I guess one advantage to delaying it is that the debug info for the dynamic library could be large and slow to parse, and you don't know if you're even going to need it until you hit the breakpoint. So by delaying the resolution even to File address, you postpone parsing potentially expensive debug info until you know you're going to need it.

It this not a consideration on OSX and/or other platforms that resolve it to a file address early?

It is really handy to see the pre-load libraries in lldb, it means you can do "source list" to make sure the breakpoints you are planning to set are where you expect them to be, and tab completion can help you set breakpoints pre-run, etc. So if you can do this, your users will be happier. If you can't get to this, then it would be better to pass different values for num_expected_locations on Windows & other platforms. Finding the breakpoint pre-launch is expected behavior on macOS & Linux, so it would be good to keep testing that that continues to work.

I guess on the other hand, it's reasonable to assume that if you've set a breakpoint somewhere, you're more likely than not to need it since you probably had a reason for setting it. Is the debug info parsed when the executable is loaded, or when the breakpoint is set?

If you want to avoid loading dependent libraries, you can use the "-d" flag to "target create". This is sometimes handy, but shouldn't be the default behavior.

The debug info is parsed only when needed. So for breakpoints, it will get parsed when the breakpoint is set. That's one good use of "break set -s <TargetBinary>" BTW, it will limit the debug info read in to just the library you expect to find the symbol in.

labath requested changes to this revision.Feb 17 2018, 4:36 AM

Yeah, setting this to zero would break other platforms that are able to locate shared libraries before running the executable. On linux, we try to locate them as well, but it's kind of on a best-effort basis -- it's quite hard to figure out what library will get loaded with absolute precision, as it can e.g. depend on the value of LD_LIBRARY_PATH env var that you run the process with (and you don't know that until the actual "process launch" command). In fact, it wouldn't be hard to come up with examples where this static resolution finds the wrong library.

So, it sounds to me like we do need a special flag for "zero or more" breakpoint locations for tests which don't care about this behavior such as this one. Maybe num_expected_locations=-2 (I think we should leave "0" meaning "exactly zero", as that can be useful in some situations). In fact I know of at least one other test that could use something like this (I've had to disable some tests in TestLoadUnload on linux because the out-of-tree build caused the automatic library lookup to not work).

Of course being able to resolve DLL dependencies statically like other platforms would be nice, but that's quite orthogonal to this what this test is doing. (Here the purpose is to check that we set the breakpoint early enough -- before the global constructor executes).

This revision now requires changes to proceed.Feb 17 2018, 4:36 AM

I have reservations about making the debugger try to pre-locate the modules and their PDBs before those modules are actually loaded. For a few reasons.

(1) On Windows, module resolution is complex. Search paths, the safe search path, manifests, the side-by-side cache, dynamic library link redirection (.exe.local), Windows API sets, etc. It got to the point that Microsoft dropped support for the venerable DependencyWalker tool (though there is an open source community trying to keep it alive). That's a lot of logic to bake into the debugger (and to create tests for), and the cost of getting it wrong is that we think your breakpoint will resolve when actually it won't.

(2) A typical program depends directly and indirectly on many DLLs. Even with lazy-evaluation, trying to apply all the rules in 1 (which must be done serially) to locate all of the dependents seems like a lot of unnecessary work on startup.

(3) It's different than how all the debuggers on Windows work, which might be mildly surprising to users.

If there's a strong will to head down this path, I think that'll be a separate effort than my getting this test working again in the short term. So I think I'll do something less invasive along the lines of Pavel's suggestion to get this test working. Stay tuned.

amccarth updated this revision to Diff 135120.Feb 20 2018, 12:01 PM

Added new special value to accommodate the fact that Windows may postpone resolving the breakpoints for DLLs that aren't yet loaded.

jingham requested changes to this revision.Feb 20 2018, 12:19 PM

At some point we should introduce "platformCapacities" so that you could do:

if not test.getPlatformCapacities("preload-dylibs"):

so we are testing specific features not the whole platform. But that seems a bigger change than this fix warrants.

It looks like your added comment has no line endings? Can you reformat the comment so it's easier to read? Then it will be good.

This revision now requires changes to proceed.Feb 20 2018, 12:19 PM
amccarth updated this revision to Diff 135124.Feb 20 2018, 12:43 PM

Wrapped the modified docstring.

At some point we should introduce "platformCapacities" so that you could do:

if not test.getPlatformCapacities("preload-dylibs"):

so we are testing specific features not the whole platform.

Yes, I agree completely.

It looks like your added comment has no line endings? Can you reformat the comment so it's easier to read? Then it will be good.

Done. I'm never sure how Python handles docstrings. The existing code looked like it was trying to avoid line breaks that weren't separating paragraphs.

I was actually thinking of making this unconditional. It makes the behaviour easier to understand and this test really does not care about whether the libraries were resolved statically or not.

I'm with Jim that we should make targeted tests for that functionality.

By unconditional, do you mean allowing any value for out_num_locations in these cases? I'm happy to do that, but I'm not sure if I've understood you correctly.

Yes, regardless of the target platform, if you specify num_locations = -2, then we just don't check the number of locations.

amccarth updated this revision to Diff 135272.Feb 21 2018, 9:03 AM

Per Pavel's suggestion, change special value to mean don't check the number of locations found.

labath accepted this revision.Feb 21 2018, 9:46 AM
This revision was not accepted when it landed; it landed in state Needs Review.Feb 21 2018, 10:10 AM
This revision was automatically updated to reflect the committed changes.