This is an archive of the discontinued LLVM Phabricator instance.

Unify+fix remote XML libraries handling with the legacy one
AbandonedPublic

Authored by jankratochvil on Jun 27 2019, 4:01 AM.

Details

Summary

D62503 broke Fedora 30 x86_64. That is because it fixed failing ReadMemory (due to PAGE_SIZE size reading shared library name overlapping to a next page which is accidentally unmapped) there and thus enabling to really use the XML libraries protocol which broke LLDB because then DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit happens only once and any further library updates are ignored.
So D62503 should be reapplied after this fix as it is safe then.
In this patch I do not understand why there should be any special add/remove handling in ProcessGDBRemote::LoadModules, it is just a new retrieved list of libraries and loading their modules is handled by later code the same way as it always worked with legacy memory reading from the list at _r_debug.
There are some more opportunities how to unify the XML protocol vs. legacy libraries handling but I wanted to first fix this pending regression.

Diff Detail

Repository
rLLDB LLDB

Event Timeline

jankratochvil created this revision.Jun 27 2019, 4:01 AM
jankratochvil marked an inline comment as done.Jun 27 2019, 4:02 AM
jankratochvil added inline comments.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DYLDRendezvous.cpp
185

This change is not really required but I do not see why the map_addr check should differ when the remote XML protocol is in use.

jankratochvil retitled this revision from Unify+fix remote XML libraries handling with legacy one to Unify+fix remote XML libraries handling with the legacy one.Jun 27 2019, 4:03 AM
jankratochvil edited the summary of this revision. (Show Details)Jun 27 2019, 4:30 AM
aadsm added a comment.Jun 27 2019, 7:39 AM

I'm not sure why DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit happens only once, I thought I had fixed that here: D62168.
I don't think we can stop loading/unloading the libraries in ProcessGDBRemote::LoadModules. The windows dynamic loader relies on this and the library entry on the stop packet does as well: https://github.com/llvm-mirror/lldb/blob/9689521b477c75b52257fba9655cabefc3db676c/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp#L2374-L2375

I actually proposed a different way of loading the libraries here (and it's faster because it reduces the generation of libraries-svr4 in half): D62504. What do you think?

I don't think we can stop loading/unloading the libraries in ProcessGDBRemote::LoadModules. The windows dynamic loader relies on this

Then it could be Windows-specific.

and the library entry on the stop packet does as well: https://github.com/llvm-mirror/lldb/blob/9689521b477c75b52257fba9655cabefc3db676c/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp#L2374-L2375

When is this library entry being used?

For implementing fast shared libraries introspection one should implement what glibc+gdb have - STAP (SystemTap) probes: Those are few breakpoints each directly reporting the one added/removed library as one can see from GDB:

(gdb) maintenance info breakpoints
(gdb) info probes
Num     Type                  Disp Enb Address            What
Type Provider   Name                   Where              Semaphore Object   
-2      shlib events          keep y   0x00007ffff7fd67a5 <dl_main+8517> inf 1
stap rtld       init_complete          0x00007ffff7fd67a5           /lib64/ld-linux-x86-64.so.2
-5      shlib events          keep y   0x00007ffff7fe61a0 <dl_open_worker+1488> inf 1
stap rtld       reloc_complete         0x00007ffff7fe61a0           /lib64/ld-linux-x86-64.so.2
-7      shlib events          keep y   0x00007ffff7fe715e <_dl_close_worker+3070> inf 1
stap rtld       unmap_complete         0x00007ffff7fe715e           /lib64/ld-linux-x86-64.so.2
aadsm added a comment.Jun 27 2019, 9:25 AM

The library is part of the GDB protocol itself: https://sourceware.org/gdb/onlinedocs/gdb/Stop-Reply-Packets.html#Stop-Reply-Packets and it's to be used with conjunction of qXfer:libraries:read (or libraries-svr4 like we do here).

Interesting, I'm not familiar with STAP but will take a look. I do think however that changing how we load the libraries is outside the scope of this diff. My main goal is to add the libraries-svr4 packet, but while doing so I found a few bugs in lldb, which I'm addressing here.

labath added a comment.Jul 1 2019, 1:58 AM

That is because it fixed failing ReadMemory (due to PAGE_SIZE size reading shared library name overlapping to a next page which is accidentally unmapped)

Does this mean that there is a bug in lldb-server, where some memory read requests fail if they span an unreadable page. Can this also be triggered by a $m packet? Would you be interested in distilling that into a test?

I don't think we can stop loading/unloading the libraries in ProcessGDBRemote::LoadModules. The windows dynamic loader relies on this

Then it could be Windows-specific.

I also think it would be good to move this stuff into the windows loader. Right now we have a lot of confusion about who is responsible for "loading" modules into a target. Some processes (I think mostly post-mortem plugins) do the loading themselves. Others create dynamic loader plugins to do that. And in the case of svr4, we have ProcessGDBRemote creating a DynamicLoader plugin, which then calls back into the process to do the actual loading. That seems too convoluted to me...

I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...

Does this mean that there is a bug in lldb-server, where some memory read requests fail if they span an unreadable page. Can this also be triggered by a $m packet? Would you be interested in distilling that into a test?

OK, I will try to reproduce it - you are right GDB documentation says for the m packet: "The reply may contain fewer addressable memory units than requested if the server was able to read only part of the region of memory. "

And LLDB is using ReadMemory in NativeProcessProtocol::ReadMemoryWithoutTrap from GDBRemoteCommunicationServerLLGS::Handle_memory_read.

I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...

IIUC you believe this patch should be more refactored? I do not have much overview of this code of LLDB (+Windows libraries in general). IIUC you propose:

  • The current libraries loading code from LoadedModuleInfoList of ProcessGDBRemote::LoadModules moved to lldb/source/Plugins/Platform/Windows/
  • later: DynamicLoaderPOSIXDYLD::RefreshModules and DynamicLoaderPOSIXDYLD::LoadAllCurrentModules somehow reusing that code (after it is moved back to some common library)? But that looks to me as too little code to make it worth it.
labath added a comment.Jul 1 2019, 3:09 AM

Does this mean that there is a bug in lldb-server, where some memory read requests fail if they span an unreadable page. Can this also be triggered by a $m packet? Would you be interested in distilling that into a test?

OK, I will try to reproduce it - you are right GDB documentation says for the m packet: "The reply may contain fewer addressable memory units than requested if the server was able to read only part of the region of memory. "

And LLDB is using ReadMemory in NativeProcessProtocol::ReadMemoryWithoutTrap from GDBRemoteCommunicationServerLLGS::Handle_memory_read.

Thanks. To deterministically reproduce an unmapped page of memory, the best approach I can think of is to first mmap a larger block of memory, and then munmap a part of it.

I think it would be better to do the loading inside the DynamicLoader plugin, and have ProcessGDBRemote::LoadModules be responsible only for sending the appropriate qXfer packet and parsing the response. The fact that loading the libraries from a LoadedModuleInfoList is largely similar for posix&windows can be handled (at a later date) by factoring the common code into a utility function, a common base class or something else...

IIUC you believe this patch should be more refactored? I do not have much overview of this code of LLDB (+Windows libraries in general). IIUC you propose:

  • The current libraries loading code from LoadedModuleInfoList of ProcessGDBRemote::LoadModules moved to lldb/source/Plugins/Platform/Windows/

Yes, that's more or less it. Though I am open to being convinced otherwise...

  • later: DynamicLoaderPOSIXDYLD::RefreshModules and DynamicLoaderPOSIXDYLD::LoadAllCurrentModules somehow reusing that code (after it is moved back to some common library)? But that looks to me as too little code to make it worth it.

Possibly. If the same code is useful for both windows and posix, then it might make sense to put it into the base DynamicLoader class, which means we wouldn't have to create new plugins or whatnot. Or we might actually want to create a DynamicLoaderSVR4, which only works with SVR4, and have these delegate to it. But that's a discussion for another day...

labath added a comment.Aug 9 2019, 1:52 AM

This should be no longer relevant, right?

jankratochvil abandoned this revision.Aug 9 2019, 1:54 AM

OK, right, closing it - I no longer see any functionality problem. Maybe one could do some more code refactorization but I do not have time for that now anyway.