This is an archive of the discontinued LLVM Phabricator instance.

Implement xfer:libraries-svr4:read packet
ClosedPublic

Authored by aadsm on May 27 2019, 3:23 PM.

Details

Summary

This is the fourth patch to improve module loading in a series that started here (where I explain the motivation and solution): D62499

Implement the xfer:libraries-svr4 packet by adding a new function that generates the list and then in Handle_xfer I generate the XML for it. The XML is really simple so I'm just using string concatenation because I believe it's more readable than having to deal with a DOM api.

Concerns

  • I'm not sure about the having the ELFLinkMap and the SharedLibraryInfo struct in there. I thought about creating a function or class outside of NativeProcessLinux so it could be reusable in other contexts but the core functions need to use ReadMemory so it was not convinent and I guess would have forced me to pass a lambda as a wrapper to ReadMemory. This would introduce more complexity to the code and I'm not sure if it would be worthwhile.
  • On Handle_xfer function the backing buffer structure uses MemoryBuffers so I had to create one out of the StreamString response I generate with the XML. This kind of sucks because I'm copying all this data when I usually don't even need it. And if I do need it still sucks because I'm copying something that I just generated right there. I couldn't find a way to generate the XML into a MemoryBuffer (as it requires to know the size before hand) or use another structure that would make this easier. I've thought about changing the backing structure to StringRefs but that object doesn't really own the underlying "const char*" so I can't keep it without copying again. Any suggestions here? [After measuring more carefully this doesn't seem to be an issue. (<0.1ms)]

Event Timeline

aadsm created this revision.May 27 2019, 3:23 PM
aadsm edited the summary of this revision. (Show Details)May 30 2019, 3:44 PM
aadsm updated this revision to Diff 202328.May 30 2019, 3:51 PM

Update with the correct commit...

aadsm updated this revision to Diff 202652.Jun 2 2019, 10:35 PM

Address comments and add tests

labath added a comment.Jun 3 2019, 2:48 AM

Regarding the test, I am wondering whether requiring the /proc/%d/maps and the linker rendezvous structure to match isn't too strict of a requirement. Dynamic linkers (and particularly android dynamic linkers) do a lot of crazy stuff, and trying to assert this invariant exposes us to all kinds of "optimizations" that the linkers do or may do in the future. I'm wondering if it wouldn't be better to just build an executable with one or two dependent libraries, and then just assert that their entries are present in the library list. However, I'm not completely sure on this, so I'd like to hear what you think about that..

clayborg added inline comments.Jun 3 2019, 9:44 AM
lldb/include/lldb/Host/common/NativeProcessProtocol.h
37–39

These seem very linux specific. Why do we need 3 addresses here? Seems like we should just need one. Or is this the structure of the "xfer:libraries-svr4:read" that is required to be returned? If so, maybe we rename "SharedLibraryInfo" to "SVR4ModuleInfo"?

40

Why is "main" important? Hopefully the dynamic linker can figure out what is what without needing to know this? Seems verify linux specific. Or is this "xfer:libraries-svr4:read" specific?

41

Why do we need this "next" member here? We get a list of these from:

virtual Status GetLoadedSharedLibraries(std::vector<SharedLibraryInfo> &library_list);

Seems like we shouldn't need this. Darwin based systems doesn't store the shared library list as a linked list. Hopefully not "xfer:libraries-svr4:read" specific?

lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2191

Use ReadCStringFromMemory?

2247

Seems this we are using "info.next" just because it is a linked list on linux. Can we remove "next" and just add an extra "next" reference parameter to ReadSharedLibraryInfo<T>(link_map, info, next);?

aadsm marked 4 inline comments as done.Jun 3 2019, 5:21 PM

@labath

Regarding the test, I am wondering whether requiring the /proc/%d/maps and the linker rendezvous structure to match isn't too strict of a requirement. Dynamic linkers (and particularly android dynamic linkers) do a lot of crazy stuff, and trying to assert this invariant exposes us to all kinds of "optimizations" that the linkers do or may do in the future. I'm wondering if it wouldn't be better to just build an executable with one or two dependent libraries, and then just assert that their entries are present in the library list. However, I'm not completely sure on this, so I'd like to hear what you think about that..

Yeah, that was actually my first idea for the test but then realized we already had that main.c all set up and ready to use so I decide to use proc maps instead. I was not aware that things like that could happen (although I had a suspicion this might be brittle) so let me revisit this.

lldb/include/lldb/Host/common/NativeProcessProtocol.h
37–39

Yes, that is a good point and yes this is the data the SVR4 returns. Given this structure it makes more sense to move this into the NativeProcessLinux instead and name it more specifically.

40

It is packet specific yes. And I just realized I'm not putting that info in the XML I'm returning, will update.

lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2191

ReadCStringFromMemory doesn't exist here yet. That's on the next diff of this stack of diffs.

2247

Yes, this is the only reason, sounds good.

aadsm marked an inline comment as done.Jun 3 2019, 8:36 PM
aadsm added inline comments.
lldb/include/lldb/Host/common/NativeProcessProtocol.h
37–39

Now that I think more about this, SVR4 is really generic to all system v implementations (4.0) so it's not linux specific. Will just change the name to more accurately reflect what is means.

aadsm updated this revision to Diff 202855.Jun 3 2019, 10:27 PM

Address comments except the one about a different test plan.
Also removed the main property, after searching on the internet I'm not confident with the way I was using to detect the main executable. Since the "lm-main" is an optional attribute I'm not going to implement it now.
However, I'm excluding from the list the libraries without name or that are at base addr == 0. They shouldn't be there by the spec.

labath added a comment.Jun 4 2019, 2:14 AM

@labath

Regarding the test, I am wondering whether requiring the /proc/%d/maps and the linker rendezvous structure to match isn't too strict of a requirement. Dynamic linkers (and particularly android dynamic linkers) do a lot of crazy stuff, and trying to assert this invariant exposes us to all kinds of "optimizations" that the linkers do or may do in the future. I'm wondering if it wouldn't be better to just build an executable with one or two dependent libraries, and then just assert that their entries are present in the library list. However, I'm not completely sure on this, so I'd like to hear what you think about that..

Yeah, that was actually my first idea for the test but then realized we already had that main.c all set up and ready to use so I decide to use proc maps instead. I was not aware that things like that could happen (although I had a suspicion this might be brittle) so let me revisit this.

I'm not sure whether it will be brittle, but I do know that the android linker does (or used to do?) some tricks where it does not really load libdl.so, but instead inserts a fake entry into the link map which pretends to implement that library (but in reality the entry points to the linker itself). That sounds like the kind of thing that could break if you try to correlate the linker data with /proc/%d/maps.

So, I think it would be better to create a separate test binary for this, particularly as it seems some tests with funny library names will be interesting too... Note that when you're creating a special-purpose executable, you don't have to do the whole wait-for-inferior-to-print-something-and-then-send-CTRL-C dance. You can just have the inferior dereference a null pointer (or something of the sort) to force a stop once it has loaded the necessary shared libraries.

lldb/include/lldb/Host/common/NativeProcessProtocol.h
98–101

This should return an Expected<std::vector<...>>

lldb/packages/Python/lldbsuite/test/tools/lldb-server/TestGdbRemoteLibrariesSvr4Support.py
52–53

Why are these all caps? TBH, you probably don't need to make these variables at all. You can just place them directly into the packet as strings.

lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2181

This too could return an llvm::Error. Expected<std::pair<SVR4LibraryInfo, addr_t>> is a bit of a mouthful, but I'd consider that instead of by-ref returns too..

2197

What if name_buffer is not null-terminated?

lldb/source/Plugins/Process/Linux/NativeProcessLinux.h
140–146

It looks like this doesn't need to be in a header. You could just define the struct inside ReadSVR4LibraryInfo

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
2768

Now that we have removed ifdefs around the auxv, we should do that here too..

2778

You're completely ignoring escaping here, which is a pretty complex topic, and one of the reasons why constructing this via the DOM api is so complicated. There are a couple of considerations here:

  • ", &, and other special xml characters: these will definitely need to be xml-escaped properly
  • non-7-bit characters: how these will come out depends on whether the client&server agree on the encoding used. I'm not sure what's the best thing to do here
  • low ascii (nonprintable) characters: it looks like these cannot be represented in xml (1.0) at all (?)

It might be good to check what gdb does in these situations

aadsm marked 3 inline comments as done.Jun 4 2019, 7:39 AM
aadsm added inline comments.
lldb/packages/Python/lldbsuite/test/tools/lldb-server/TestGdbRemoteLibrariesSvr4Support.py
52–53

tbh I mostly copy pasted the auxv test :D but yeah it can be made simpler.

lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2197

Nicely spotted. I should add the size but this made me realize I'm not ensuring a \0 on my next diff where I implement ReadCStringFromMemory.

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
2778

Oh yeah, this is a path so of course that will happen. (we probably should update ds2 for that as well).

aadsm updated this revision to Diff 203083.Jun 4 2019, 11:01 PM
  • Improved the tests by using a binary that is specificly linked to other shared libs.
  • USe libxml2 to generate the name attribute (I've talked with Greg about this and he recommended to create a new class for this)
  • Address other comments
aadsm marked an inline comment as done.Jun 4 2019, 11:02 PM
aadsm added inline comments.
lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
2778

gdb manually escapes < > " and & into their respective &code; but doesn't seem to handle characters like \r, etc.

Using the libxml function for the escaping is fine, but there are a couple of caveats. I believe people expect to be able to build lldb without libxml. At least that what all the existing xml-related code assumes (you'll see it has some #ifdefs and stuff). If we don't intend to change that, then we'll need to do something similar here. It's fine if some features (like this one) don't work without libxml, but they should degrade gracefully.

The svr reading support in the client depends on having libxml around, so depending on it here would be consistent. OTOH, lldb-server is used also in embedded(ish) environments where libxml is hard to come by (or just big), so it may make sense to try to avoid it here (particularly as you're still constructing the xml elements by hand).

So overall, it's up to you how you want to proceed here. However, if you do use libxml, I think you'll need to do something like:

  • make sure the changes in XML.h compile without libxml (move stuff into an ifdef, provide a no-op implementation for the other case, and similar)
  • only say we support the svr extension if XMLDocument::XMLEnabled() is true
  • guard the tests with @skipIfXmlSupportMissing
aadsm updated this revision to Diff 203477.Jun 6 2019, 7:51 PM

No longer depend on libxml2 to quote the attribute value

aadsm added a comment.Jun 6 2019, 7:54 PM

@labath that's a really good point. I've talked with @clayborg and @xiaobai about this and we decided it would be fine to do it manually as well so I added a XMLEncodeAttributeValue function to make it happen. I'm not 100% sure about its implementation.. Should I use a StreamString instead?

labath added a comment.Jun 7 2019, 2:27 AM

@labath that's a really good point. I've talked with @clayborg and @xiaobai about this and we decided it would be fine to do it manually as well so I added a XMLEncodeAttributeValue function to make it happen. I'm not 100% sure about its implementation.. Should I use a StreamString instead?

The implementation is fine, just make it take a StringRef instead of a char pointer.

The thing I'm not sure about is the location, as now this is the only function in XML.h that actually works without libxml support. Given that this is something we don't want to advertise too broadly, maybe you could just make it a static function GDBRemoteCommunicationServerLLGS? If this trick stays limited to lldb-server (which I think it should), then that could even be the right place for it.

aadsm updated this revision to Diff 203656.Jun 7 2019, 6:06 PM

Move XMLEncodeAttributeValue to GDBRemoteCommunicationServerLLGS.

aadsm added a comment.Jun 7 2019, 6:09 PM

Sounds good, I put it there initially because the XML.cpp set of classes abstract the libxml2 specific away but I guess it is odd to have XMLDocument::XMLEnabled() returning false and still be able to call that new function.

labath added a subscriber: mgorny.Jun 9 2019, 11:27 PM

This looks mostly good. The thing I'm wondering about is whether this part of the code is really linux specific (in which case we should rename ELFLinkMap into LinuxLinkMap), or if it can be useful on other platforms (in which case this should go into the newly-created NativeProcessELF). @mgorny , @krytarowski, what do you think?

lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.py
21–29 ↗(On Diff #203656)

This countinue-and-immediatelly-interrupt sequence seems very dodgy. What's the purpose of that? Given that the inferior forces a break with the null dereference, I would expect you don't need to send any interrupt packets here, just simply wait for the inferior to stop.

lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2181

What about that llvm::Error?

https://nxr.netbsd.org/xref/src/include/link_elf.h#9

In general this code should be close to functional on NetBSD (if not already compatible).

aadsm marked an inline comment as done.Jun 10 2019, 9:09 AM

It's the same for freebsd https://github.com/freebsd/freebsd/blob/master/sys/kern/link_elf.c#L291 although behind a GDB flag (which NetBSD doesn't seem to be: https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/rtld.c#1040).

lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.py
21–29 ↗(On Diff #203656)

ah yeah, it will stop already on the sigsev.

aadsm marked an inline comment as done.Jun 10 2019, 10:36 AM
aadsm added inline comments.
lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2181

oops sorry, missed this one. I do my best to check if I addressed all the comments but sometimes I miss some :(.

aadsm updated this revision to Diff 203947.Jun 10 2019, 5:52 PM
  • Update test to just wait for the process to stop
  • Change ReadSVR4LibraryInfo signature to return an Expected to the list of libraries and stop taking params by ref.

Ok, so given that both NetBSD and FreeBSD implement this feature (and the current DynamicLoaderPOSIXDYLD plugin reads it), I think we should move this bit of code into NativeProcessELF -- after all, that's what we've created it for.

Yeah, that makes sense. I was thinking about this yesterday after checking the freebsd code. I was concerned if there's something different that I'm not aware of but we do have tests and I guess we can also override these functions if needed in the future.

aadsm updated this revision to Diff 204423.Jun 12 2019, 10:08 PM
  • Move SVR4 reading to NativeProcessELF
  • Made NetBSD and FreeBSD process implementations extend NativeProcessELF

I'm not 100% confident on this last one because I don't have a net and free bsd to test on, and I find it suspicious that the auxv packet is only enabled for linux and netbsd...

labath accepted this revision.Jun 13 2019, 1:29 AM

LGTM, with the following comment taken into account.

The ProcessFreeBSD parts are definitely wrong, as ProcessFreeBSD does not use lldb-server yet (it uses local only path). Just revert that part.

As for NetBSD, doing that might be fine, but I'd also be fine with leaving it up to the netbsd maintainers to enable this feature (by inheriting from NativeProcessELF). In other words, my request was only to make it *possible* for free/netbsd to use this code (by putting it into NativeProcessELF). I didn't mean they have to start using it *right now*.

This revision is now accepted and ready to land.Jun 13 2019, 1:29 AM

I would leave the NetBSD version as it is in this patch and let us to fix/test it later.

aadsm updated this revision to Diff 204850.Jun 14 2019, 2:46 PM

Only extend support to NetBSD

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 18 2019, 10:49 AM

As I kind of expected, one of these tests was flaky because the names for the vdso pseudo-module were not adding up. When reading from /proc/%pid/maps, we got "[vdso]" (as that's how the kernel likes to call this page), but while reading the linker rendezvous structure, we got "linux-vdso.so.1" (I guess that's the SONAME of the module or something).

I "fixed" this in r363772, by restricting this test to only check the libraries that we ourselves have linked into the process.

Even after @labath's rL363772 the testcase TestGdbRemoteLibrariesSvr4Support is still FAILing on Fedora 30 x86_64, I will check it more later.