D62503 broke Fedora 30 x86_64. That is because it fixed failing ReadMemory (due to PAGE_SIZE size reading shared library name overlapping to a next page which is accidentally unmapped) there and thus enabling to really use the XML libraries protocol which broke LLDB because then DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit happens only once and any further library updates are ignored.
So D62503 should be reapplied after this fix as it is safe then.
In this patch I do not understand why there should be any special add/remove handling in ProcessGDBRemote::LoadModules, it is just a new retrieved list of libraries and loading their modules is handled by later code the same way as it always worked with legacy memory reading from the list at _r_debug.
There are some more opportunities how to unify the XML protocol vs. legacy libraries handling but I wanted to first fix this pending regression.
Details
Diff Detail
- Repository
- rLLDB LLDB
Event Timeline
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp | ||
---|---|---|
185 | This change is not really required but I do not see why the map_addr check should differ when the remote XML protocol is in use. |
I'm not sure why DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit happens only once, I thought I had fixed that here: D62168.
I don't think we can stop loading/unloading the libraries in ProcessGDBRemote::LoadModules. The windows dynamic loader relies on this and the library entry on the stop packet does as well: https://github.com/llvm-mirror/lldb/blob/9689521b477c75b52257fba9655cabefc3db676c/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp#L2374-L2375
I actually proposed a different way of loading the libraries here (and it's faster because it reduces the generation of libraries-svr4 in half): D62504. What do you think?
Then it could be Windows-specific.
and the library entry on the stop packet does as well: https://github.com/llvm-mirror/lldb/blob/9689521b477c75b52257fba9655cabefc3db676c/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp#L2374-L2375
When is this library entry being used?
For implementing fast shared libraries introspection one should implement what glibc+gdb have - STAP (SystemTap) probes: Those are few breakpoints each directly reporting the one added/removed library as one can see from GDB:
(gdb) maintenance info breakpoints (gdb) info probes Num Type Disp Enb Address What Type Provider Name Where Semaphore Object -2 shlib events keep y 0x00007ffff7fd67a5 <dl_main+8517> inf 1 stap rtld init_complete 0x00007ffff7fd67a5 /lib64/ld-linux-x86-64.so.2 -5 shlib events keep y 0x00007ffff7fe61a0 <dl_open_worker+1488> inf 1 stap rtld reloc_complete 0x00007ffff7fe61a0 /lib64/ld-linux-x86-64.so.2 -7 shlib events keep y 0x00007ffff7fe715e <_dl_close_worker+3070> inf 1 stap rtld unmap_complete 0x00007ffff7fe715e /lib64/ld-linux-x86-64.so.2
The library is part of the GDB protocol itself: https://sourceware.org/gdb/onlinedocs/gdb/Stop-Reply-Packets.html#Stop-Reply-Packets and it's to be used with conjunction of qXfer:libraries:read (or libraries-svr4 like we do here).
Interesting, I'm not familiar with STAP but will take a look. I do think however that changing how we load the libraries is outside the scope of this diff. My main goal is to add the libraries-svr4 packet, but while doing so I found a few bugs in lldb, which I'm addressing here.
Does this mean that there is a bug in lldb-server, where some memory read requests fail if they span an unreadable page. Can this also be triggered by a $m packet? Would you be interested in distilling that into a test?
I also think it would be good to move this stuff into the windows loader. Right now we have a lot of confusion about who is responsible for "loading" modules into a target. Some processes (I think mostly post-mortem plugins) do the loading themselves. Others create dynamic loader plugins to do that. And in the case of svr4, we have ProcessGDBRemote creating a DynamicLoader plugin, which then calls back into the process to do the actual loading. That seems too convoluted to me...
I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...
OK, I will try to reproduce it - you are right GDB documentation says for the m packet: "The reply may contain fewer addressable memory units than requested if the server was able to read only part of the region of memory. "
And LLDB is using ReadMemory in NativeProcessProtocol::ReadMemoryWithoutTrap from GDBRemoteCommunicationServerLLGS::Handle_memory_read.
I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...
IIUC you believe this patch should be more refactored? I do not have much overview of this code of LLDB (+Windows libraries in general). IIUC you propose:
- The current libraries loading code from LoadedModuleInfoList of ProcessGDBRemote::LoadModules moved to lldb/source/Plugins/Platform/Windows/
- later: DynamicLoaderPOSIXDYLD::RefreshModules and DynamicLoaderPOSIXDYLD::LoadAllCurrentModules somehow reusing that code (after it is moved back to some common library)? But that looks to me as too little code to make it worth it.
Thanks. To deterministically reproduce an unmapped page of memory, the best approach I can think of is to first mmap a larger block of memory, and then munmap a part of it.
I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...
IIUC you believe this patch should be more refactored? I do not have much overview of this code of LLDB (+Windows libraries in general). IIUC you propose:
- The current libraries loading code from LoadedModuleInfoList of ProcessGDBRemote::LoadModules moved to lldb/source/Plugins/Platform/Windows/
Yes, that's more or less it. Though I am open to being convinced otherwise...
- later: DynamicLoaderPOSIXDYLD::RefreshModules and DynamicLoaderPOSIXDYLD::LoadAllCurrentModules somehow reusing that code (after it is moved back to some common library)? But that looks to me as too little code to make it worth it.
Possibly. If the same code is useful for both windows and posix, then it might make sense to put it into the base DynamicLoader class, which means we wouldn't have to create new plugins or whatnot. Or we might actually want to create a DynamicLoaderSVR4, which only works with SVR4, and have these delegate to it. But that's a discussion for another day...
OK, right, closing it - I no longer see any functionality problem. Maybe one could do some more code refactorization but I do not have time for that now anyway.
This change is not really required but I do not see why the map_addr check should differ when the remote XML protocol is in use.