This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Prevent re-adding a module that is already loaded
AbandonedPublic

Authored by alvinhochun on Sep 24 2022, 1:28 AM.

Details

Summary

LLDB preloads some dependent modules when creating the target. On
Windows, when the target is run and the dependent DLLs are actually
loaded by the process, ProcessWindows::OnLoadDll calls
DynamicLoaderWindowsDYLD::OnLoadModule which in turn calls
Target::GetOrCreateModule to try to load the DLL module. However, it
did not properly handle the case of getting an already-preloaded module,
and as a result any dependent DLLs preloaded by LLDB will end up being
duplicated in the target module list and causing some commands to return
repeated results (e.g. target modules lookup).

This issue does not seem to affect Linux, because the codepath it uses,
which goes through DynamicLoader::FindModuleViaTarget, has an
additional check to find existing modules before calling
Target::GetOrCreateModule.

This patch adds an early return to Target::GetOrCreateModule for when
it finds a pre-existing module already in the target module list to fix
the issue for Windows.

Diff Detail

Event Timeline

alvinhochun created this revision.Sep 24 2022, 1:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2022, 1:28 AM
alvinhochun requested review of this revision.Sep 24 2022, 1:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2022, 1:28 AM

@mstorsjo Can you please try the test I added and check that it passes with the patch and fails without it? (I don't have a setup to run tests on Windows.)

I probably misunderstand the situation DynamicLoaderWindows finds itself in, but in DynamicLoaderDarwin we have similar behavior - you run 'lldb binary' and lldb loads all the directly linked dynamic libraries into the target it creates, pre-execution. When execution starts, the dynamic linker tells us where the binary is loaded in memory and we call Target::SetExecutableModule with it. Target::SetExecutableModule has a side effect of clearing the Target's module list - you now have only one binary in the Target, your executable module. (this is done so the 0th image in the image list is your executable module)

Why aren't your pre-loaded DLLs unloaded when you call Target::SetExecutableModule?

I'll have to admit, a quick read through of Target::GetOrAddModule() worries me about adding this mid-function return - the code clearly has a mix of cases where it has found a Module on disk that is new to lldb and is loading it, it has found a module in lldb's global module cache which has the same UUID and/or filepath, and something in there about a Target which already has a module and the newly loaded module is used to replace it - not sure what that's about, but I only read it quickly. My initial impression is that this change is unlikely to be the right thing to do, tbh, regardless of the more basic "why are we in this situation in the first place". Again, I might be mistaken/not understanding the issue properly though.

Thanks for the comment.

I probably misunderstand the situation DynamicLoaderWindows finds itself in, but in DynamicLoaderDarwin we have similar behavior - you run 'lldb binary' and lldb loads all the directly linked dynamic libraries into the target it creates, pre-execution. When execution starts, the dynamic linker tells us where the binary is loaded in memory and we call Target::SetExecutableModule with it. Target::SetExecutableModule has a side effect of clearing the Target's module list - you now have only one binary in the Target, your executable module. (this is done so the 0th image in the image list is your executable module)

Why aren't your pre-loaded DLLs unloaded when you call Target::SetExecutableModule?

In ProcessWindows::OnDebuggerConnected, it first checks GetTarget().GetExecutableModule(). Only when the returned module is null (i.e. the binary hasn't already been loaded) then it calls GetTarget().SetExecutableModule(module, eLoadDependentsNo). If I understood you correctly, then the right thing to do there should be to call GetTarget().SetExecutableModule(module, eLoadDependentsNo) in all cases, without checking GetExecutableModule, right?

It seems to make sense, but I may need some comments from other reviewers on this.

I'll have to admit, a quick read through of Target::GetOrAddModule() worries me about adding this mid-function return - the code clearly has a mix of cases where it has found a Module on disk that is new to lldb and is loading it, it has found a module in lldb's global module cache which has the same UUID and/or filepath, and something in there about a Target which already has a module and the newly loaded module is used to replace it - not sure what that's about, but I only read it quickly. My initial impression is that this change is unlikely to be the right thing to do, tbh, regardless of the more basic "why are we in this situation in the first place". Again, I might be mistaken/not understanding the issue properly though.

I checked again and I think you are probably correct. Other than !did_create_module it seems an additional condition && m_images.FindModule(module_sp.get()) will be needed for it to do the right thing.

Updated condition for early return

mstorsjo added inline comments.Sep 26 2022, 2:14 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

Unfortunately, this aspect of the test doesn't work as is on Windows.

By default (in msvc builds), clang invokes link.exe here, but link.exe can't take %t.shlib.dll as input file to the linker, as it requires an import library.

If we would assume that lld is available, we could run the linking command with -fuse-ld=lld - however that on its own isn't enough, as lld only allows linking against DLLs when run with the -lldmingw flag. We can add -fuse-ld=lld -Wl,-lldmingw here to fix that, but that would break testing in mingw environments, as -lldmingw isn't a valid option on the mingw lld driver.

Likewise, we could create a proper import library by adding -Wl,-implib:%t.shlib.lib on the first command above, but that doesn't work in mingw mode either, as it would have to be -Wl,--out-implib=%t.shlib.lib instead.

In practice, running the lldb tests in mingw mode is not entirely supported, while they do pass cleanly in MSVC mode (and there's a buildbot testing this configuration) - but I would like to work towards making things work better in the mingw configuration too.

There's a different substitution, %build, which invokes the lldb/test/Shell/helpers/build.py script, which abstracts a bunch of boilerplate details mostly relevant for windows targets (like creating PDB debug info files); the easiest at this point would probably be to extend that script with options for creating import libraries.

For testing with mingw, I'm currently using this out of tree patch for that script too:

diff --git a/lldb/test/Shell/helper/build.py b/lldb/test/Shell/helper/build.py
index 96684b7b3e66..f138b00bee9e 100755
--- a/lldb/test/Shell/helper/build.py
+++ b/lldb/test/Shell/helper/build.py
@@ -191,7 +191,10 @@ def find_toolchain(compiler, tools_dir):
     if compiler == 'any':
         priorities = []
         if sys.platform == 'win32':
-            priorities = ['clang-cl', 'msvc', 'clang', 'gcc']
+            if 'gcc' in sys.version.lower():
+                priorities = ['clang', 'gcc', 'clang-cl', 'msvc']
+            else:
+                priorities = ['clang-cl', 'msvc', 'clang', 'gcc']
         else:
             priorities = ['clang', 'gcc', 'clang-cl']
         for toolchain in priorities:

(This is a bit hacky, since it uses the build type of the python interpreter to switch the preference between clang-cl and clang.)

Alternatively, we could maybe add a separate substitution that expands into either -implib: or --out-implib=, but I'm not sure if we can guess which one will be the right one either, while the build.py helper script probably can do better.

mstorsjo added inline comments.Sep 26 2022, 2:33 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

I forgot to mention; see https://reviews.llvm.org/D129455 for some earlier discussion about other aspects (symbol table vs dwarf vs PDB) for windows testcases in lldb.

alvinhochun added inline comments.Sep 26 2022, 3:34 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

Re different options on msvc vs mingw, how about using the %if substitution (https://www.llvm.org/docs/TestingGuide.html)? We can use it to check for the windows-msvc feature to apply options conditionally.

alvinhochun added inline comments.Sep 26 2022, 3:40 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

The commands should become this:

# RUN: %clang_host -g0 -O0 -shared %S/Inputs/shlib.c -o %t.shlib.dll %if windows-msvc %{-Wl,-implib:%t.shlib.lib}
# RUN: %clang_host -g0 -O0 %S/Inputs/main.c -o %t.main.exe %if windows-msvc %{%t.shlib.lib} %else %{%t.shlib.dll}
mstorsjo added inline comments.Sep 26 2022, 3:40 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

Oh, I guess that would work too - it does litter the individual tests with the conditions instead of hiding it in build.py, but probably is acceptable too.

mstorsjo added inline comments.Sep 26 2022, 3:49 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

This form works, and keeps it to just one conditional:

# RUN: %clang_host -g0 -O0 -shared %S/Inputs/shlib.c -o %t.shlib.dll %if windows-msvc %{-Wl,-implib:%t.shlib.lib%} %else %{-Wl,--out-implib=%t.shlib.lib%}
# RUN: %clang_host -g0 -O0 %S/Inputs/main.c %t.shlib.lib -o %t.main.exe
alvinhochun added inline comments.Sep 26 2022, 3:54 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

Oh yeah, this does feel nicer. Have you already tested it?

mstorsjo added inline comments.Sep 26 2022, 4:00 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

Yes, I tested the form I quoted above, and it works with both msvc and mingw based setups.

Updated test, thanks to @mstorsjo

mstorsjo added inline comments.Sep 26 2022, 4:12 AM
lldb/test/Shell/Target/dependent-modules-nodupe-windows.test
8

This closing } is lacking the preceding %, i.e. it should be %}, like on the second row.

Fixed test again

I probably misunderstand the situation DynamicLoaderWindows finds itself in, but in DynamicLoaderDarwin we have similar behavior - you run 'lldb binary' and lldb loads all the directly linked dynamic libraries into the target it creates, pre-execution. When execution starts, the dynamic linker tells us where the binary is loaded in memory and we call Target::SetExecutableModule with it. Target::SetExecutableModule has a side effect of clearing the Target's module list - you now have only one binary in the Target, your executable module. (this is done so the 0th image in the image list is your executable module)

Why aren't your pre-loaded DLLs unloaded when you call Target::SetExecutableModule?

In ProcessWindows::OnDebuggerConnected, it first checks GetTarget().GetExecutableModule(). Only when the returned module is null (i.e. the binary hasn't already been loaded) then it calls GetTarget().SetExecutableModule(module, eLoadDependentsNo). If I understood you correctly, then the right thing to do there should be to call GetTarget().SetExecutableModule(module, eLoadDependentsNo) in all cases, without checking GetExecutableModule, right?

It seems to make sense, but I may need some comments from other reviewers on this.

I made a follow-up patch D134636 for this.

Thanks for the comment.

I probably misunderstand the situation DynamicLoaderWindows finds itself in, but in DynamicLoaderDarwin we have similar behavior - you run 'lldb binary' and lldb loads all the directly linked dynamic libraries into the target it creates, pre-execution. When execution starts, the dynamic linker tells us where the binary is loaded in memory and we call Target::SetExecutableModule with it. Target::SetExecutableModule has a side effect of clearing the Target's module list - you now have only one binary in the Target, your executable module. (this is done so the 0th image in the image list is your executable module)

Why aren't your pre-loaded DLLs unloaded when you call Target::SetExecutableModule?

In ProcessWindows::OnDebuggerConnected, it first checks GetTarget().GetExecutableModule(). Only when the returned module is null (i.e. the binary hasn't already been loaded) then it calls GetTarget().SetExecutableModule(module, eLoadDependentsNo). If I understood you correctly, then the right thing to do there should be to call GetTarget().SetExecutableModule(module, eLoadDependentsNo) in all cases, without checking GetExecutableModule, right?

It seems to make sense, but I may need some comments from other reviewers on this.

Just my opinion, but I know how DynamicDarwinLoader works is that when it starts the actual debug session, it clears the image list entirely (iirc or maybe it just calls Target::SetExecutableModule - which is effectively the same thing). I don't know how Windows works, but on Darwin we pre-load the binaries we THINK will be loaded, but when the process actually runs, different binaries may end up getting loaded, and we need to use what the dynamic linker tells us instead of our original logic. GetTarget().SetExecutableModule(module, eLoadDependentsNo) would be one option, or clear the list and start adding, effectively the same thing. I think it would be more straightforward than adding this change to Target::GetOrAddModule.

Thanks for the comment.

I probably misunderstand the situation DynamicLoaderWindows finds itself in, but in DynamicLoaderDarwin we have similar behavior - you run 'lldb binary' and lldb loads all the directly linked dynamic libraries into the target it creates, pre-execution. When execution starts, the dynamic linker tells us where the binary is loaded in memory and we call Target::SetExecutableModule with it. Target::SetExecutableModule has a side effect of clearing the Target's module list - you now have only one binary in the Target, your executable module. (this is done so the 0th image in the image list is your executable module)

Why aren't your pre-loaded DLLs unloaded when you call Target::SetExecutableModule?

In ProcessWindows::OnDebuggerConnected, it first checks GetTarget().GetExecutableModule(). Only when the returned module is null (i.e. the binary hasn't already been loaded) then it calls GetTarget().SetExecutableModule(module, eLoadDependentsNo). If I understood you correctly, then the right thing to do there should be to call GetTarget().SetExecutableModule(module, eLoadDependentsNo) in all cases, without checking GetExecutableModule, right?

It seems to make sense, but I may need some comments from other reviewers on this.

Just my opinion, but I know how DynamicDarwinLoader works is that when it starts the actual debug session, it clears the image list entirely (iirc or maybe it just calls Target::SetExecutableModule - which is effectively the same thing). I don't know how Windows works, but on Darwin we pre-load the binaries we THINK will be loaded, but when the process actually runs, different binaries may end up getting loaded, and we need to use what the dynamic linker tells us instead of our original logic. GetTarget().SetExecutableModule(module, eLoadDependentsNo) would be one option, or clear the list and start adding, effectively the same thing.

Sounds right. On Windows, events from the debug loop will tell us which DLLs are actually being loaded by the process and we add them to the module list.

I think it would be more straightforward than adding this change to Target::GetOrAddModule.

Here's the thing, even if we change the Windows debugging support to clear the module list when starting the process, the logic of Target::GetOrAddModule still appears to be flawed if it can result in duplicated modules in the target module list, so IMO it needs to be fixed regardless.

In ProcessWindows::OnDebuggerConnected, it first checks GetTarget().GetExecutableModule(). Only when the returned module is null (i.e. the binary hasn't already been loaded) then it calls GetTarget().SetExecutableModule(module, eLoadDependentsNo). If I understood you correctly, then the right thing to do there should be to call GetTarget().SetExecutableModule(module, eLoadDependentsNo) in all cases, without checking GetExecutableModule, right?

It seems to make sense, but I may need some comments from other reviewers on this.

Just my opinion, but I know how DynamicDarwinLoader works is that when it starts the actual debug session, it clears the image list entirely (iirc or maybe it just calls Target::SetExecutableModule - which is effectively the same thing). I don't know how Windows works, but on Darwin we pre-load the binaries we THINK will be loaded, but when the process actually runs, different binaries may end up getting loaded, and we need to use what the dynamic linker tells us instead of our original logic. GetTarget().SetExecutableModule(module, eLoadDependentsNo) would be one option, or clear the list and start adding, effectively the same thing.

Sounds right. On Windows, events from the debug loop will tell us which DLLs are actually being loaded by the process and we add them to the module list.

Seems reasonable to me. I'm not actually sure if we're doing this on linux/posix (as I still see duplicated vdso modules occasionaly -- but this could be caused by other problems as well), but if we're not -- we probably should.

I think it would be more straightforward than adding this change to Target::GetOrAddModule.

Here's the thing, even if we change the Windows debugging support to clear the module list when starting the process, the logic of Target::GetOrAddModule still appears to be flawed if it can result in duplicated modules in the target module list, so IMO it needs to be fixed regardless.

I can totally believe that there is a bug in the GetOrAddModule, but I would not be able to tell you if this is the right fix or not. As for multiple modules, I can say that on linux it is actually possible to load the same shared library more than once (into different "linker namespaces"). LLDB does not currently support that, but I would like to support it one day. I don't know whether this change would make that harder or easier, though.

Was the consensus that we'd drop this patch?

alvinhochun abandoned this revision.Oct 21 2022, 1:46 AM

Since the original issue was fixed in a different patch I don't think I will follow up on this anyway.