When a process is loaded, update its sections with the load address to resolve any created breakpoints. For the remote debugging case, the debugged process is launched remotely so GetLoadAddress is intended to pass the load address from remote to LLDB (client).
Details
Diff Detail
Event Timeline
We should probably have a test for this. It sounds like all you need to do to test this is create a target, set a breakpoint, run the process, then verify that you see some text that says the breakpoint was resolved. With the new build.py script this should be able to work in a cross-platform manner, so maybe it can go under lit/ExecControl and use compiler=any (or don't specify a compiler at all, since that's the default).
I think there is something wrong here. Normally, it should be the job of the dynamic loader plugin to figure out where a module got loaded (and populate the target section load list). Process::GetFileLoadAddress is there to help it with this job. So, the fact that ProcessWindows::GetFileLoadAddress goes back to the target section load list to get the data creates a weird loop. The only way I can see this working is if this implementation happens to return the module base address, as specified in the object file, and that base address happens to be correct (no ASLR). The kind of code I would expect to see here is a call to some windows API to determine the load address straight from the OS (if such a thing is possible). For example, on Linux, we parse the /proc/<pid>/maps file to determine this information.
source/Plugins/Process/Windows/Common/ProcessWindows.cpp | ||
---|---|---|
878 ↗ | (On Diff #179977) | I think FileOffset will always be zero here. The only case where this is not zero is for fat macho binaries. And even in that case, I don't think adding that offset to the load address is the right thing to do. |
886 ↗ | (On Diff #179977) | typo |
Seems like DynamicLoaderWindowsDYLD isn't doing its job correctly here and DynamicLoaderWindowsDYLD should be fixed. We shouldn't have to go to the process at all in order to set the section load addresses correctly. Why? As you might have seen lldb-server.exe is being worked on now and then we will have two clients for DynamicLoaderWindowsDYLD: the native ProcessWindows plug-in and ProcessGDBRemote. The
source/Plugins/Process/Windows/Common/ProcessWindows.cpp | ||
---|---|---|
858 ↗ | (On Diff #179977) | This entire function is doing the job of the dynamic loader and should be moved into the dynamic loader. |
So it seems like DynamicLoaderWindowsDYLD is not doings its job correctly and it needs to be fixed. DynamicLoaderWindowsDYLD is responsible for setting all of the section load addresses correctly using an abstract lldb_private::Process plug-in. No special calls should be added if that is possible. We have someone working on lldb-server.exe and then we will have two clients for DynamicLoaderWindowsDYLD: ProcessWindows and ProcessGDBRemote. DynamicLoaderWindowsDYLD should work for any lldb_private::Process subclass without the need to call back into a specific lldb_private::Process subclass method.
source/Plugins/Process/Windows/Common/ProcessWindows.cpp | ||
---|---|---|
858 ↗ | (On Diff #179977) | I am saying the DynamicLoaderWindowsDYLD should be fixed, so that no matter how we connect to a windows process, we get correct DYLD functionality. The way this patch is currently coded, you require every process plug-in that connects to windows processes to have this work around. |
The Process::GetFileLoadAddress is needed for both process clients mentioned here. For the remote process, it will be the remote platform's responsibility to provide the load address. The remote platform could be lldb-server or other server eligible to connect to lldb. Currently the processwindows plugin knows about the load address since the plugin is notified the process base address (or say the randomized image base) by windows debug event when the process is loaded on Windows. So the processwindows::GetFileLoadAddress actually does a job for dyld's convenience. Later on, if a native windows process is just a pass-through to processwindows, it will benefit the same feature (I am thinking processwindows is working as a common implementation for windows specific).
It is doable on windows to get the process base address by enumerate the process's module. I tried the following. The load address has the same result as processwindows::GetFileLoaddAddress().
+lldb::addr_t Host::GetProcessBaseAddress(lldb::pid_t pid) { + AutoHandle snapshot(CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, pid)); + if (!snapshot.IsValid()) + return LLDB_INVALID_ADDRESS; + + MODULEENTRY32W me; + me.dwSize = sizeof(MODULEENTRY32W); + if (Module32FirstW(snapshot.get(), &me)) { + // The first module is always the EXE or DLL itself. + return (addr_t)me.modBaseAddr; + } + + return LLDB_INVALID_ADDRESS; +} +
I think there are some architectural issues here in how ProcessWindows works. Normally (i.e., on all non-windows process classes), it is the job of the dynamic loader plugin to create modules and set their load addresses. However, ProcessWindows does things differently, and creates the modules by itself (ProcessWindows::OnLoadDLL).
I think the behavior of ProcessWindows is fine (it flows naturally from how modules are handled on windows), but if it's going to handle module loading by itself, then I think it should do the job completely, and not only half-way. So, if we want ProcessWindows to do the module loading, then it should return a null value for GetDynamicLoader() (or a reasonably stubbed out implementation) instead of relying on a generic dynamic loader plugin to finish the job.
Or, if we want to have the DynamicLoader plugin, then ProcessWindows should stop meddling with the module list in the OnLoadDLL function. One way to achieve that would be:
- ProcessWindows::OnLoadDLL calls DynamicLoaderWindowsDYLD::OnLoadDll
- dynamic loader loads the module and sets load address
- dynamic loader does not have to call ProcessWindows::GetFileLoadAddress as it already got the load address from in the OnLoadDLL function
This would make the flow very similar to how other platforms handle it (the only difference would be that the "OnLoadDLL" equivalent is called from a breakpoint callback function, instead of from a special debug event callback, but that's just due to a difference in how modules are reported).
I would like the see the dynamic loader do all of the work. We can easily add any new virtual functions to the DynamicLoader class to fit how windows does things.
Going to the Host straight from the DynamicLoader is the worst possible option, since then this code will not be correct for remote processes (it will just randomly pick up some data from the local process with the same pid, if you happen to have one). I'd like to hear what you think about the latest comments by Greg & myself.
That should be fairly easy to achieve. The process class is the one that creates the dynamic loader (in GetDynamicLoader()), so it can easily call any special method on it that it needs from OnLoadDLL (i.e., we don't even need to extend the public DynamicLoader API).
However, note that this would not be the first process class that handles module loading on its own. ProcessMinidump already does that too. It could be changed to use a special dynamic loader too, but it's not clear to me whether that is worth the trouble, as the loading process for minidumps is very simple.
source/Plugins/DynamicLoader/Windows-DYLD/DynamicLoaderWindowsDYLD.cpp | ||
---|---|---|
82 | Shall not come to this point for a remote process. Need to bail out ahead. |
source/Plugins/DynamicLoader/Windows-DYLD/DynamicLoaderWindowsDYLD.cpp | ||
---|---|---|
82 | Yes, but what's the reason for the fallback in the first place? ProcessWindows is already by definition on the host, so you could do the same work in ProcessWindows::GetFileLoadAddress (but I'm not saying you should, since that information should already be known through ProcessWindows, as it received it through the windows debug api in OnLoadDLL ). |
I am thinking getting load address and setting load address by dyld could be done in separate reviews? For the latter, it might lead to some architecture changes in processwindows plugin.
For this review, getting load address of a host process can just leverage the OS API if no processwindows plugin's help is wanted.
Servers other than lldb-server.exe can easily implement the qFileLoadAddress packet (using their own loader) as a reply to m_process->GetFileLoadAddress().
source/Plugins/DynamicLoader/Windows-DYLD/DynamicLoaderWindowsDYLD.cpp | ||
---|---|---|
82 | Server other lldb-server.exe could respond with a bogus address to qFileLoadAddress packet and then lldb (on host) will come to this point and result in the case you mentioned.
|
Yes, two reviews would be great. However, the way I see this, the second patch would basically remove the need for the special Host function you're trying to introduce here. So I'd suggest to do the second part first, and then this should be very simple addon on top of that.
Servers other than lldb-server.exe can easily implement the qFileLoadAddress packet (using their own loader) as a reply to m_process->GetFileLoadAddress().
I am aware of that. However, my point is that you shouldn't need *any* fallback here. If ProcessWindows::GetFileLoadAddress does the right thing, then all the dyld plugin would not need to call any host functions *at all* (which is just a fundamentally wrong thing to do).
I like the direction this is going. I have one small comment about the code organization inline, but the bigger question I have in mind is: After these changes, is there any use for Host::GetProcessBaseAddress left? If so, what is it? I'd like to understand it and remove it.
source/Plugins/Process/Windows/Common/ProcessWindows.cpp | ||
---|---|---|
868 ↗ | (On Diff #186401) | You could have this function return DynamicLoaderWindowsDYLD * (thanks to the covariant return type thingy), and avoid having to cast the result in a bunch of places. |
1039–1044 ↗ | (On Diff #186401) | On other platforms, it is the responsibility of the DYLD plugin to add the module to the target list (i.e. call Target::GetSharedModule or something equivalent). I think it would be more consistent to move this code there too. At that point you may just delete ProcessWindows::OnLoadDll and have the debugger thread call process_windows->GetDynamicLoader()->OnLoadDLL() directly. The same goes for module UnloadDLL. |
From my understanding, lldb-server.exe is to debug a process instantiated by NativeProcessProtocol which might or might not relate to ProcessWindows plugin.
To remove the host codes mentioned above, it requires the implementation of NativeProcessProtocol also notify the DYLD the load address of the main executable otherwise
DYLD has to do the job with the OS help. For now I think just remove them.
Yes, the NativeProcessWindows will need to notify the DYLD about the load address somehow. If DYLD went and did that on it's own, it would kill the "remote" aspect of lldb-server, as the DYLD plugin lives in the client. I think @sas is already passing that information somehow with ds2, so it might be good to sync up with him, so we don't reinvent something new. (My guess would be that they just have ds2 implement the packet underlying the GetFileLoadAddress() call.)
It short, let's remove the host code for now, and we can discuss that again after we get around to implementing NativeProcessWindows or whatever.
Looks good to me now. I think that, for completeness, you ought to add the DYLD plugin to the list of dependencies for in the CMakeLists.txt for ProcessWindows. Thanks for putting up with me.
source/Plugins/Process/Windows/Common/ProcessWindows.h | ||
---|---|---|
94 ↗ | (On Diff #186796) | add override |
Shall not come to this point for a remote process. Need to bail out ahead.